Vertica integration with Apache Kafka error

Could you please help me with the following issue?
I have Vertica 9.3 installed on one host and Apache Kafka 2.4.0 installed on a second host, and I want to integrate Vertica with Kafka. I configured everything following the official Vertica guide, but when I try to create a source:
vkconfig source --create --cluster kafka_weblog --source test --partitions 1 --conf /opt/vertica/packages/kafka/config/my.conf
I get this error:
Exception in thread "main" com.vertica.solutions.kafka.exception.ConfigurationException: ERROR: [[Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]]
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:248)
at com.vertica.solutions.kafka.model.StreamSource.setFromMapAndValidate(StreamSource.java:194)
at com.vertica.solutions.kafka.model.StreamModel.<init>(StreamModel.java:93)
at com.vertica.solutions.kafka.model.StreamSource.<init>(StreamSource.java:44)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:62)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:13)
at com.vertica.solutions.kafka.cli.CLI.run(CLI.java:59)
at com.vertica.solutions.kafka.cli.CLI._main(CLI.java:141)
at com.vertica.solutions.kafka.cli.SourceCLI.main(SourceCLI.java:29)
Caused by: java.sql.SQLNonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:227)
... 8 more
Caused by: com.vertica.support.exceptions.NonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
If I query the table weblog_sched.stream_clusters, the hosts column contains localhost:9092 rather than the IP address of my Kafka server (192.168.0.8), even though I specified the Kafka server's address when I created the cluster:
vkconfig cluster --create --cluster kafka_weblog --hosts 192.168.0.8:9092 --conf /opt/vertica/packages/kafka/config/my.conf
Why is that? (I think this error is related to the incorrect entry in weblog_sched.stream_clusters.)

There are a few ways to debug this.
Try telnet 192.168.0.8 9092. If you can connect, the port is open and Kafka is running and reachable from the machine where you ran the command. If you cannot connect, try the following.
Check your firewall or iptables rules and allow port 9092 on that machine.
Ex: iptables -A INPUT -p tcp --dport 9092 -j ACCEPT
If you still cannot reach it, allow the port in (or temporarily disable) the host firewall (ufw, firewalld, etc.), as shown below.
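If the broker host uses ufw or firewalld, the equivalent commands would look roughly like this (a sketch; adjust the zone or profile to your setup):
sudo ufw allow 9092/tcp
sudo firewall-cmd --permanent --add-port=9092/tcp && sudo firewall-cmd --reload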
Check your advertised.listeners broker configuration. Ensure that it is set to PLAINTEXT://192.168.0.8:9092 (if you are using the PLAINTEXT listener). This is the most likely fix.
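For reference, a minimal server.properties sketch, assuming the default PLAINTEXT listener and that clients must reach the broker at 192.168.0.8:9092:
# listen on all interfaces, but advertise the address clients actually use
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://192.168.0.8:9092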

Make sure to add the Kafka hosts to the /etc/hosts file on the Vertica cluster nodes; that should resolve the issue. If it does, it means your advertised.listeners has been configured incorrectly.
Hardcoding hosts in /etc/hosts is definitely not the best approach, though. You should fix advertised.listeners as the more permanent solution.
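If you do try the /etc/hosts workaround as a quick test, the entry on each Vertica node simply maps the advertised hostname to the broker's IP (kafka-broker-1 here is a hypothetical hostname the broker advertises):
192.168.0.8   kafka-broker-1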

Related

How to configure Kafka for ClickHouse without an error being thrown?

I have two Kafka clusters: the first, Kafka-A, uses the SASL SCRAM-SHA-256 mechanism for authentication; the other, Kafka-B, has no authentication configured.
To connect to Kafka-A from ClickHouse, I configured the config.xml file as shown below.
My config.xml configuration:
<kafka>
<security_protocol>sasl_plaintext</security_protocol>
<sasl_mechanism>SCRAM-SHA-256</sasl_mechanism>
<sasl_username>xxx</sasl_username>
<sasl_password>xxx</sasl_password>
<debug>all</debug>
<auto_offset_reset>latest</auto_offset_reset>
<compression_type>snappy</compression_type>
</kafka>
At this point I found that I can't connect to Kafka-B using a Kafka engine table. When I try, the following error is printed:
StorageKafka (xxx): [rdk:FAIL]
[thrd:sasl_plaintext://xxx/bootstrap]:
sasl_plaintext://xxx/bootstrap: SASL SCRAM-SHA-256
mechanism handshake failed: Broker: Request not valid in current SASL
state: broker's supported mechanisms: (after 3ms in state
AUTH_HANDSHAKE, 4 identical error(s) suppressed)
It seems that when connecting to Kafka-B, ClickHouse also uses SASL authentication, which causes the error, since the Kafka-B brokers are not configured for authentication.
How can I configure ClickHouse correctly to connect to the different Kafka clusters?
ClickHouse allows you to define a Kafka configuration per topic.
Use the topic name in the name of the XML section:
<kafka_mytopic>
<security_protocol>....
....
</kafka_mytopic>
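For example, a rough sketch of a per-topic section for a Kafka-B topic (kafka_b_topic is a hypothetical topic name), overriding the cluster-wide SASL settings with plain, unauthenticated ones for that topic:
<kafka_kafka_b_topic>
    <security_protocol>plaintext</security_protocol>
    <auto_offset_reset>latest</auto_offset_reset>
</kafka_kafka_b_topic>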

Kafka Sink: ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130)

I am trying to stream data from one file to another file. It was working earlier, but suddenly it started failing with ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130). I have restarted ZooKeeper, the Kafka server, the Schema Registry, and the source and sink connectors, but I am still facing the same issue and unable to resolve it. Any suggestion would be helpful.
Source connector:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/home/jimmacaulay/Desktop/ETL/Kafka/confluent-5.5.1/data/data/Jim_Source.csv
topic=Jim
Sink connector:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=/home/jimmacaulay/Desktop/ETL/Kafka/confluent-5.5.1/data/data/Jim_Sink.csv
topics=Jim
Error:
[2020-08-18 06:25:50,482] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130)
org.apache.kafka.connect.errors.ConnectException: Unable to initialize REST server
at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:217)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:87)
Caused by: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8083
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:346)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:307)
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:231)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.eclipse.jetty.server.Server.doStart(Server.java:385)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:215)
... 1 more
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:342)
... 8 more
I resolved the error by starting connect-standalone with the source and sink properties together:
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-source_Topic_Jim.properties ../config/connect-file-sink_Topic_Jim.properties
Earlier I was starting them separately, as below:
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-source_Topic_Jim.properties
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-sink_Topic_Jim.properties
Cause of the issue:
When I start them separately, connect-standalone first starts for the source properties and binds the REST port 8083. When I then start it again for the sink properties, it tries to use the same port and fails.
Solutions:
Pass both the source and sink properties files when starting connect-standalone, so they share the same worker and port.
Or define different REST port numbers in the worker properties files and start them separately, as sketched below.
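A sketch for a second worker properties file (a hypothetical copy of connect-avro-standalone.properties); the listeners setting moves the Connect REST server off the default port 8083:
# second standalone worker: bind the REST server to a different port
listeners=http://localhost:8084
# older Connect releases use the deprecated rest.port=8084 property instead
# each standalone worker also needs its own offset.storage.file.filename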

Kafka Failed to create new KafkaAdminClient on Kerberized Cluster

I have a Kerberized cluster with Kafka on it.
I want to use the Confluent Schema Registry with Kafka on the cluster.
Launching the Schema Registry from my local PC, everything works just fine.
But when I uploaded it to a machine in the cluster and tried to run it from there, I get:
Error
ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain:50)
org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
...
Caused by: org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Message stream modified (41)
...
Caused by: javax.security.auth.login.LoginException: Message stream modified (41)
I also tried to run it from another machine in the cluster and I get the same result.
Schema-registry.properties
listeners=http://0.0.0.0:8081
kafkastore.connection.url=master01.domain.ext:2181,master02.domain.ext:2181
kafkastore.bootstrap.servers=SASL_PLAINTEXT://xxx.domain.ext:6667,SASL_PLAINTEXT://xxx.domain.ext:6667
kafkastore.topic=_schemas
debug=true
kafkastore.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="/path/schema-registry/etc/schema-registry/my-keytab.keytab" \
principal="kafka/xxx.domain.ext#DOMAIN.EXT";
kafkastore.sasl.kerberos.service.name=kafka
kafkastore.security.protocol=SASL_PLAINTEXT
kafkastore.sasl.mechanism=GSSAPI
EXECUTION COMMAND:
sudo bash bin/schema-registry-start ./etc/schema-registry/schema-registry.properties
QUESTIONS:
Why does it work on my local PC and not on a machine in the cluster?
What should I change?
P.S. I get the same result even when trying to run CMAK (Yahoo kafka-manager), using the same jaas.config and the same keytab.

While trying to run Apache Atlas, I'm facing an HBase error (I'm using embedded HBase and Solr)

My Apache Atlas server has started, but I found errors in my application.log file, and the UI for Apache Atlas is also not running.
I've followed every step from the Apache website, and all went well.
I gave all permissions in the atlas-env.sh and application-properties files.
Can anyone help me figure this out?
2019-10-25 12:25:49,366 WARN - [main:] ~ Running setup per configuration atlas.server.run.setup.on.start. (SetupSteps$SetupRequired:186)
2019-10-25 12:25:50,104 WARN - [main:] ~ Retrieve cluster id failed (ConnectionImplementation:551)
java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/hbaseid
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:549)
at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:287)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:219)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:114)
at org.janusgraph.diskstorage.hbase2.HBaseCompat2_0.createConnection(HBaseCompat2_0.java:46)
at org.janusgraph.diskstorage.hbase2.HBaseStoreManager.<init>(HBaseStoreManager.java:314)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:476)
at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:408)
at org.janusgraph.graphdb.configuration.GraphDatabaseC
When HBase starts, the HBase Master node creates the node "/hbase/hbaseid" in ZooKeeper.
1. Check the processes.
Check whether HBase and ZooKeeper are running with 'jps -m'.
If you configured HBase to manage ZooKeeper internally, you will not see a ZooKeeper process with the jps command; in that case, check its port with 'netstat -nt | grep ZK_PORT' (normally it uses 2181).
netstat -nt | grep 2181
2. Check the ZooKeeper node.
If you run the ZooKeeper cluster independently, you can check the node "/hbase/hbaseid" with the ZooKeeper CLI like this:
ZOOKEEPER/bin/zkCli.sh
[zk: ...] ls /
[zk: ...] get /hbase/hbaseid
I hope this helps.
Install Atlas
You can download the source code of Atlas v2.0.0 or the master branch from here and build it.
$ export MAVEN_OPTS="-Xms2g -Xmx2g"
$ mvn clean install
$ mvn clean package -Pdist
If you build the master branch, you can find the server package under /SOURCE_CODE/distro/target/apache-atlas-3.0.0-SNAPSHOT-server/apache-atlas-3.0.0-SNAPSHOT.
You should configure the server before running it.
Here are the minimum settings. You can find the atlas-application.properties file in the conf directory.
atlas.graph.storage.hostname=xxx.xxx.xxx.xxx:xxxx => zookeeper addr and port for hbase
atlas.graph.index.search.backend=[solr or elasticsearch] => choose one you want to use.
atlas.graph.index.hostname=xxx.xxx.xxx.xxx => solr or elasticsearch server's addr
atlas.kafka.zookeeper.connect=xxx.xxx.xxx.xxx:xxxx => zookeeper addr and port for Kafka
atlas.kafka.bootstrap.servers=xxx.xxx.xxx.xxx:xxxx => kafka addr
atlas.audit.hbase.zookeeper.quorum=xxx.xxx.xxx.xxx:xxxx => zookeeper addr and port for hbase
To run the server:
$ bin/atlas_start.py
Install ZooKeeper
Actually, to install ZooKeeper, there is almost nothing to do; just follow the steps.
In this case, you should change your HBase environment in hbase-env.sh:
export HBASE_MANAGES_ZK=false
If you see warnings in the HBase log file like 'Could not start ZK at requested port of 2181.', check the hbase-site.xml file and set hbase.cluster.distributed to true, as in the fragment below.
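A minimal hbase-site.xml fragment for that setting (the surrounding <configuration> element is assumed to already exist):
<property>
    <!-- run in distributed mode so HBase does not start its own ZooKeeper -->
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>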

Kafka + Kubernetes + Helm + `/usr/bin/kafka-avro-console-consumer`?

How do I use the standard kafka-avro-console-consumer tool with Kafka running via the Confluent Helm Charts? The confluentinc/cp-kafka:5.0.0 image recommended for running CLI utilities doesn't contain kafka-avro-console-consumer.
If I shell into the schema-registry pod to use kafka-avro-console-consumer:
kubectl exec -it my-confluent-oss-cp-schema-registry-6c8546c86d-pjpmd -- /bin/bash
/usr/bin/kafka-avro-console-consumer --bootstrap-server my-confluent-oss-cp-kafka:9092 --topic my-test-avro-records --from-beginning
Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 5555; nested exception is:
java.net.BindException: Address already in use (Bind failed)
sun.management.AgentConfigurationError: java.rmi.server.ExportException: Port already in use: 5555; nested exception is:
java.net.BindException: Address already in use (Bind failed)
at sun.management.jmxremote.ConnectorBootstrap.startRemoteConnectorServer(ConnectorBootstrap.java:480)
at sun.management.Agent.startAgent(Agent.java:262)
at sun.management.Agent.startAgent(Agent.java:452)
Caused by: java.rmi.server.ExportException: Port already in use: 5555; nested exception is:
java.net.BindException: Address already in use (Bind failed)
java.rmi.server.ExportException: Port already in use: 5555;
Sounds like you've enabled JMX as part of that container via the KAFKA_JMX_PORT variable.
If that's the case, you will need to temporarily override it within the shell session, either by unsetting it or by exporting a different value, before running any other Kafka scripts, for example as sketched below.
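A sketch of what that might look like inside the pod (the exact variable names depend on the image; JMX_PORT is what the Kafka launch scripts read, and KAFKA_JMX_PORT is assumed to be the container/chart setting):
# use an unused JMX port (or unset the variables entirely) for this shell only
export JMX_PORT=5556 KAFKA_JMX_PORT=5556
/usr/bin/kafka-avro-console-consumer --bootstrap-server my-confluent-oss-cp-kafka:9092 --topic my-test-avro-records --from-beginning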