Debezium IO with Pulsar

I want to understand how Pulsar uses the Debezium IO connector for CDC.
When creating the source using pulsar-admin source create, how can I pass the broker URL and authentication params for the client, similar to what we do when using localrun?
The command I run:
bin/pulsar-admin source localrun --sourceConfigFile debezium-mysql-source-config.yaml --client-auth-plugin --client-auth-params --broker-service-url
Now I want to replace this with a connector that runs in cluster mode.

Localrun is a special mode that simplifies debugging; it runs outside of the normal cluster, so it needs extra parameters to create the client for the local runtime.
In cluster mode the connector gets its client from the Pulsar connectors runtime, i.e. through the function worker configuration. All you need to do is use "bin/pulsar-admin source create ...".
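For example, the create command itself stays minimal, and the broker URL and client auth for cluster mode live in the function worker configuration instead. A rough sketch (property names are taken from conf/functions_worker.yml and may vary slightly between Pulsar versions):
```
# cluster mode: no client/broker flags on the command line
bin/pulsar-admin source create \
  --name debezium-mysql-source \
  --sourceConfigFile debezium-mysql-source-config.yaml

# broker URL and client auth are configured once for the connectors runtime,
# e.g. in conf/functions_worker.yml (names may differ by Pulsar version):
#   pulsarServiceUrl: pulsar+ssl://broker.example.com:6651
#   clientAuthenticationPlugin: org.apache.pulsar.client.impl.auth.AuthenticationTls
#   clientAuthenticationParameters: tlsCertFile:/path/to/client.cert.pem,tlsKeyFile:/path/to/client.key-pk8.pem
```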

Related

Apache Beam on FlinkRunner doesn't read from Kafka

I'm trying to run Apache Beam backed by a local Flink cluster in order to consume from a Kafka topic, as described in the documentation for ReadFromKafka.
The code is basically this pipeline, plus some other setup as described in the Beam examples:
with beam.Pipeline() as p:
    lines = p | ReadFromKafka(
        consumer_config={'bootstrap.servers': bootstrap_servers},
        topics=[topic],
    ) | beam.WindowInto(beam.window.FixedWindows(1))
    output = lines | beam.FlatMap(lambda x: print(x))
    output | WriteToText(output)
Since I wanted to run on Flink, I followed this doc for Beam on Flink and did the following:
--> Downloaded the binaries for Flink 1.10 and followed these instructions to properly set up the cluster. I checked the logs for the server and the task instance; both were properly initialized.
--> Started Kafka using Docker, exposing it on port 9092.
--> Executed the following in the terminal:
python example_1.py --runner FlinkRunner --topic myTopic --bootstrap_servers localhost:9092 --flink_master localhost:8081 --output output_folder
The terminal outputs
2.23.0: Pulling from apache/beam_java_sdk Digest: sha256:3450c7953f8472c2312148a2a8324b0115fd71b3a7a01a3b017f6de69d89dfe1 Status: Image is up to date for apache/beam_java_sdk:2.23.0 docker.io/apache/beam_java_sdk:2.23.0
But then, after writing some messages to myTopic, the terminal remains frozen and I don't see anything in the output folder. I checked flink-conf.yaml, and given these two lines
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
I assumed that the port for submitting jobs would be 6123 instead of 8081 as specified in the Beam documentation, but the behaviour is the same for both ports.
I'm very new to Beam/Flink, so I'm not quite sure what the problem could be. I have two hypotheses at the moment, but can't quite figure out how to investigate them:
1. Something related to the port that Beam uses to communicate with Flink in order to send the jobs.
2. The Expansion Service for the Python SDK mentioned in the apache.beam.io.external.ReadFromKafka docs:
Note: To use these transforms, you need to start a Java Expansion Service. Please refer to the portability documentation on how to do that. Flink Users can use the built-in Expansion Service of the Flink Runner’s Job Server. The expansion service address has to be provided when instantiating the transforms.
But the portability documentation refers me back to the same doc for Beam on Flink.
Could someone please help me out?
Edit: I was writing to the topic using the Debezium source connector for PostgreSQL and seeing the behaviour mentioned above. But when I tried writing to the topic manually, the application crashed with the following:
RuntimeError: org.apache.beam.sdk.util.UserCodeException: org.apache.beam.sdk.coders.CoderException: cannot encode a null byte[]
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
You are doing everything correctly; the Java Expansion Service no longer needs to be started manually (see the latest docs). Also, Flink serves the web UI at 8081, but accepts job submission there just as well, so either port works fine.
It looks like you may be running into the issue that Python's TextIO does not yet support streaming.
Additionally, there is the complication that when running Python pipelines on Flink, the actual code runs inside a Docker container, so if you are trying to write to a "local" file it will be a file inside the container, not on your machine.
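As a workaround while developing, one option is to drop the text sink and just log each element; that sidesteps both the streaming TextIO limitation and the in-container file path. A rough sketch reusing the question's bootstrap_servers/topic variables (note that the ReadFromKafka import path has moved between Beam releases, so adjust it for your version):
```python
import logging

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka  # apache_beam.io.external.kafka on older releases
from apache_beam.options.pipeline_options import PipelineOptions


def run(bootstrap_servers, topic, beam_args):
    options = PipelineOptions(beam_args, streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | ReadFromKafka(
                consumer_config={'bootstrap.servers': bootstrap_servers},
                topics=[topic],
            )
            | beam.WindowInto(beam.window.FixedWindows(1))
            # log each (key, value) pair instead of WriteToText: Python's TextIO
            # cannot write unbounded PCollections, and a "local" output path would
            # end up inside the SDK harness container anyway
            | beam.Map(lambda kv: logging.info("record: %s", kv))
        )
```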

Programmatically create Artemis cluster on remote server

Is it possible to programmatically create/update a cluster on a remote Artemis server?
I will have lots of Docker instances and would rather configure them on the fly than have to set things in XML files, if possible.
Ideally, on app launch I'd like to check whether a cluster has been set up and, if not, create one.
This would probably involve getting the current server configuration and updating it with the cluster details.
I see it's possible to create a Configuration.
However, I'm not sure how to get the remote server configuration, if it's at all possible.
Configuration config = new ConfigurationImpl();
ClusterConnectionConfiguration ccc = new ClusterConnectionConfiguration();
ccc.setAddress("231.7.7.7");
config.addClusterConfiguration(ccc);
// need a way to get and update the current server configuration
ActiveMQServer.getConfiguration();
Any advice would be appreciated.
If it is possible, is this a good approach to take to configure on the fly?
Thanks
The org.apache.activemq.artemis.core.config.impl.ConfigurationImpl object can be used to programmatically configure the broker. The broker test-suite uses this object to configure broker instances. However, this object is not available in any remote sense.
Once the broker is started there is a rich management API you can use to add things like security settings, address settings, diverts, bridges, addresses, queues, etc. However, the changes made by most (although not all) of these operations are volatile, which means many of them would need to be performed every time the broker starts. Furthermore, there are no management methods to add cluster connections.
You might consider using a tool like Ansible to manage the configuration or even roll your own solution with a templating engine like FreeMarker to customize the XML and then distribute it to your Docker instances using some other technology.
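For reference, this is roughly what the test-suite style of programmatic configuration looks like for an embedded broker started in the same JVM. It is only a sketch of the standard Artemis API, not a way to reach a remote broker, and a real cluster connection would also need a discovery group or static connector list to find the other nodes:
```java
import org.apache.activemq.artemis.core.config.ClusterConnectionConfiguration;
import org.apache.activemq.artemis.core.config.Configuration;
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
import org.apache.activemq.artemis.core.server.ActiveMQServer;
import org.apache.activemq.artemis.core.server.ActiveMQServers;

public class EmbeddedClusteredBroker {
    public static void main(String[] args) throws Exception {
        Configuration config = new ConfigurationImpl();
        config.setPersistenceEnabled(false);
        config.setSecurityEnabled(false);

        // connector/acceptor that the cluster connection will use
        config.addAcceptorConfiguration("netty", "tcp://0.0.0.0:61616");
        config.addConnectorConfiguration("netty", "tcp://localhost:61616");

        ClusterConnectionConfiguration ccc = new ClusterConnectionConfiguration();
        ccc.setName("my-cluster");
        ccc.setConnectorName("netty");
        ccc.setAddress(""); // empty address matches all addresses
        // a real setup also needs a discovery group or static connectors here
        config.addClusterConfiguration(ccc);

        // this configures the broker embedded in this JVM, not a remote one
        ActiveMQServer server = ActiveMQServers.newActiveMQServer(config);
        server.start();
    }
}
```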

Directly connecting jaeger client to remote collector using kafka as intermediate buffer

I am trying to connect to the Jaeger collector, which uses Kafka as an intermediate buffer.
Here are my doubts; could anyone please point me to some docs?
QUESTION
1. How do I connect to the collector directly, skipping the agent, and use Kafka as the intermediate buffer? Please provide the command or configuration.
2. What's the configuration for Kafka to connect to a particular host? When I tried the command below, it still points to localhost and fails:
docker run -e SPAN_STORAGE_TYPE=kafka jaegertracing/jaeger-collector:1.17
```{"level":"fatal","ts":1585063279.3705006,"caller":"collector/main.go:70","msg":"Failed to init storage factory","error":"kafka: client has run out of available brokers to talk to (Is your cluster reachable?)","stacktrace":"main.main.func1\n\tgithub.com/jaegertraci```
Please provide some sample examples that I can go through.
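For what it's worth, the collector's Kafka producer is normally pointed at the brokers via the --kafka.producer.* flags or their environment-variable equivalents. A sketch of what that might look like (flag/env names are from the Jaeger docs and can differ between versions, so check jaeger-collector --help for 1.17; the broker address and topic below are placeholders):
```
# assumed broker address and topic; verify the exact env names for your version
docker run \
  -e SPAN_STORAGE_TYPE=kafka \
  -e KAFKA_PRODUCER_BROKERS=my-kafka-host:9092 \
  -e KAFKA_PRODUCER_TOPIC=jaeger-spans \
  jaegertracing/jaeger-collector:1.17
```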

Elixir Kafka client Elsa

I am trying to create topics dynamically in Kafka, but unfortunately an error occurs. Here is my code:
def hello_from_elsa do
  topic = "producer-manager-test"
  connection = :conn

  Elsa.Supervisor.start_link(endpoints: @endpoints, connection: connection)

  Elsa.create_topic(@endpoints, topic)
end
As far as I understand, I can connect to the broker itself, but when the create-topic line is executed I get this error:
(MatchError) no match of right hand side value: false
(kafka_protocol) src/kpro_brokers.erl:240: anonymous fn/1 in :kpro_brokers.discover_controller/2
(kafka_protocol) src/kpro_lib.erl:376: :kpro_lib.do_ok_pipe/2
(kafka_protocol) src/kpro_lib.erl:281: anonymous fn/3 in :kpro_lib.with_timeout/2
I am not sure whether I'm missing some additional step before creating the topic. But it should be fine, I guess, since I start the supervisor and it's running :/
Hard to say, since the error is coming from the underlying kafka_protocol library and not Elsa directly, but it looks like no Kafka cluster controller can be found.
Topic management has to be done through a controller node, so the with_connection function that create_topic wraps explicitly passes the atom :controller to establish the connection. For whatever reason, likely something specific to your cluster, the function isn't able to find a controller.
What type of cluster are you testing against? If you use the divo and divo_kafka libraries, you can stand up a single-node Kafka cluster with Docker on your local host to test against, and it should work as expected.
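For comparison, a minimal sketch against such a single-node local broker might look like the following; the [host: port] keyword-list endpoint format is what Elsa expects, though exact option and return values may differ between Elsa versions:
```elixir
# assumes a single-node Kafka broker reachable from the app at localhost:9092;
# return-value patterns are assumed -- check the Elsa docs for your version
endpoints = [localhost: 9092]

{:ok, _pid} =
  Elsa.Supervisor.start_link(endpoints: endpoints, connection: :topic_admin_test)

# topic creation goes through the cluster controller under the hood
:ok = Elsa.create_topic(endpoints, "producer-manager-test")
```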

Safely give secret/token to Kafka Connector?

We are using Kafka connectors (JDBC and others) and configuring them via the REST API (using curl in shell scripts). Right now, when testing/developing, we include secrets (for the JDBC connector: database user/password) directly in the request. This is obviously bad, as they are then readily available for everybody to see when reading them back out with a GET request.
Is there a good way to give secrets to the connectors? We can bring them in safely using environment variables or config files (injected from OpenShift), but is there a syntax available for that when starting a connector via the REST API?
EDIT: This is for the distributed mode of connectors; i.e., configuration by REST API, not connector config files...
A pluggable interface for this was implemented in Apache Kafka 2.0 through KIP-297. You can see more details in the documented example here.
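As a sketch of how that pluggable interface is typically wired up with the FileConfigProvider that ships with Kafka (the paths, connector class, and property keys below are illustrative, not from the question): the worker is pointed at a config provider once, and connector configs submitted over REST then reference placeholders instead of literal secrets.
```
# worker config (e.g. connect-distributed.properties)
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider

# connector config submitted over REST uses placeholders; the secrets file
# /opt/secrets/jdbc.properties (e.g. mounted from an OpenShift secret) holds
# db.user=... and db.password=...
curl -X POST http://connect:8083/connectors -H 'Content-Type: application/json' -d '{
  "name": "jdbc-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://db:5432/mydb",
    "connection.user": "${file:/opt/secrets/jdbc.properties:db.user}",
    "connection.password": "${file:/opt/secrets/jdbc.properties:db.password}"
  }
}'
```
A GET on the connector afterwards returns the placeholder rather than the resolved secret, since resolution happens inside the worker at runtime.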