"SchemaRegistryException: Failed to get Kafka cluster ID" for LOCAL setup - apache-kafka

I downloaded the tarball for Confluent Platform 7.0.0 from the official Confluent site (I am on a Mac) and was following the LOCAL (1 node) setup. Kafka and ZooKeeper start fine, but the Schema Registry keeps failing. (Note: I am behind a corporate VPN.)
The exception message in the SchemaRegistry logs is:
[2021-11-04 00:34:22,492] INFO Logging initialized #1403ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2021-11-04 00:34:22,543] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2021-11-04 00:34:22,614] INFO Adding listener: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2021-11-04 00:35:23,007] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryException: Failed to get Kafka cluster ID
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1488)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:166)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:71)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:90)
at io.confluent.rest.Application.configureHandler(Application.java:271)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:245)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:180)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1486)
... 7 more
My schema-registry.properties file has the bootstrap URL set to:
kafkastore.bootstrap.servers=PLAINTEXT://localhost:9092
I saw some posts saying the Schema Registry may be unable to connect to the Kafka cluster URL, potentially because of the localhost address. I am fairly new to Kafka and basically just need this local setup to run a git repo that uses some topics/Kafka, so my questions:
How can I fix this? (I am behind a corporate VPN, but I figured that shouldn't affect this.)
Do I even need the Schema Registry?
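One way to check whether the broker is actually reachable at that address is a probe like the following (a sketch, assuming the Confluent tarball layout; kafka-broker-api-versions ships in its bin directory):
# Prints the broker's supported API versions if localhost:9092 is
# reachable; hangs and times out if it is not.
bin/kafka-broker-api-versions --bootstrap-server localhost:9092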

I ended up going with the local Docker setup instead. The only change I had to make to the docker-compose YAML was the schema-registry host port (I changed it to 8082 or 8084, I don't remember exactly, just an unused port not already taken by another Confluent service listed in the docker-compose.yaml), and my local setup is working fine now.
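For reference, the relevant fragment of the quickstart docker-compose.yml looks roughly like this (a sketch; the service name, image tag, and broker address are taken from the standard Confluent example and may differ in yours). Only the host side of the port mapping needs to change:
schema-registry:
    image: confluentinc/cp-schema-registry:7.0.0
    depends_on:
        - broker
    ports:
        - "8082:8081"    # host port moved off 8081; the container port stays 8081
    environment:
        SCHEMA_REGISTRY_HOST_NAME: schema-registry
        SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:29092'
        SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081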

Related

Kafka server running into java.net.SocketException: Invalid argument exception

I am trying to start up a Kafka server locally on macOS with an M1 chip. I followed the guide from the official Kafka quickstart (https://kafka.apache.org/quickstart). ZooKeeper starts up fine, but bin/kafka-server-start.sh config/server.properties gives me the socket invalid argument exception below:
[2023-01-30 09:22:55,790] ERROR Encountered an error while configuring the connection, closing it. (kafka.network.DataPlaneAcceptor)
java.net.SocketException: Invalid argument
at java.base/sun.nio.ch.Net.setIntOption0(Native Method)
at java.base/sun.nio.ch.Net.setSocketOption(Net.java:373)
at java.base/sun.nio.ch.SocketChannelImpl.setOption(SocketChannelImpl.java:234)
at java.base/sun.nio.ch.SocketAdaptor.setBooleanOption(SocketAdaptor.java:270)
at java.base/sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:305)
at kafka.network.Acceptor.configureAcceptedSocketChannel(SocketServer.scala:759)
at kafka.network.Acceptor.accept(SocketServer.scala:737)
at kafka.network.Acceptor.acceptNewConnections(SocketServer.scala:703)
at kafka.network.Acceptor.run(SocketServer.scala:645)
at java.base/java.lang.Thread.run(Thread.java:829)
I have tried:
Double-checking that no other application is using the same port (see the one-liner after this list)
Using a different JDK (from OpenJDK 17 to OpenJDK 11 and back to 17)
Rebooting my machine (twice)
Clearing the Kafka-related log folders under /tmp
Using a lower version (3.2.1) of the Kafka tarball (that one worked for me before, but now it also runs into the same socket issue)
Changing the ZooKeeper port from 2181 to something else
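For the port check, a one-liner like this works on macOS/Linux (9092 is the default broker port; adjust if yours differs):
# Show any process already listening on the broker port
lsof -nP -iTCP:9092 -sTCP:LISTEN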
Turns out to be an antivirus issue. Sorry for the false alarm.

Keycloak cluster fails on Amazon ECS (org.infinispan.commons.CacheException: Initial state transfer timed out for cache)

I am trying to deploy a cluster of 2 Keycloak Docker images (6.0.1) on Amazon ECS (Fargate) using the built-in ECS Service Discovery mechanism (using DNS_PING).
Environment:
JGROUPS_DISCOVERY_PROTOCOL=dns.DNS_PING
JGROUPS_DISCOVERY_PROPERTIES=dns_query=my.services.internal,dns_record_type=A
JGROUPS_TRANSPORT_STACK=tcp    # also tried udp
The instance IPs are correctly resolved from the Route53 private namespace, and the instances discover each other without any problem (x.x.x.138 is started first, then x.x.x.76).
Second instance:
[org.jgroups.protocols.dns.DNS_PING] (ServerService Thread Pool -- 58) ip-x.x.x.76: entries collected from DNS (in 3 ms): [x.x.x.76:0, x.x.x.138:0]
[org.jgroups.protocols.dns.DNS_PING] (ServerService Thread Pool -- 58) ip-x.x.x.76: sending discovery requests to hosts [x.x.x.76:0, x.x.x.138:0] on ports [55200 .. 55200]
[org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 58) ip-x.x.x.76: sending JOIN(ip-x-x-x-76) to ip-x-x-x-138
And on the first instance:
[org.infinispan.CLUSTER] (thread-8,ejb,ip-x-x-x-138) ISPN000094: Received new cluster view for channel ejb: [ip-x-x-x-138|1] (2) [ip-x-x-x-138, ip-172-x-x-x-76]
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-8,ejb,ip-x-x-x-138) Joined: [ip-x-x-x-76], Left: []
[org.infinispan.CLUSTER] (thread-8,ejb,ip-x-x-x-138) ISPN100000: Node ip-x-x-x-76 joined the cluster
[org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-12,ejb,ip-x-x-x-76) ip-x-x-x-76: pingable_mbrs=[ip-x-x-x-138, ip-x-x-x-76], ping_dest=ip-x-x-x-138
So it seems we have a working cluster. Unfortunately, the second instance ends up failing with the following exception:
Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache work on ip-x-x-x-76
Before this occurs, I see a bunch of failure-detection tasks suspecting/unsuspecting the opposite instance:
[org.jgroups.protocols.FD_ALL] (Timer runner-1,null,null) haven't received a heartbeat from ip-x-x-x-76 for 60016 ms, adding it to suspect list
[org.jgroups.protocols.FD_ALL] (Timer runner-1,null,null) ip-x-x-x-138: suspecting [ip-x-x-x-76]
[org.jgroups.protocols.FD_ALL] (thread-9,ejb,ip-x-x-x-138) Unsuspecting ip-x-x-x-76
[org.jgroups.protocols.FD_SOCK] (thread-9,ejb,ip-x-x-x-138) ip-x-x-x-138: broadcasting unsuspect(ip-x-x-x-76)
On the Infinispan (cache) side, everything seems to happen correctly, but I am not sure. Every cache is "rebalanced" and each rebalance seems to end with, for example:
[org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p24-t2) Finished receiving of segments for cache offlineSessions for topology 2.
It feels like it's a connectivity issue, but all the ports are wide open between these 2 instances, both for TCP and UDP.
Any idea? Anyone successful at configuring this on ECS (Fargate)?
UPDATE 1
The second instance was initially shutting down not because of the "Initial state transfer timed out .." error but because the health check was taking longer than the configured grace period. Nonetheless, with 2 healthy instances, I receive "404 - Not Found" once every 2 queries, telling me that there is indeed a cache problem.
In the current Keycloak Docker image (6.0.1), the default stack is UDP. According to this, version 7.0.0 will default to TCP and will also introduce a variable to toggle the stack (JGROUPS_TRANSPORT_STACK).
Using the UDP stack in Amazon ECS will "partially" work: the discovery works and the cluster forms, but the Infinispan cache won't be able to sync between instances, which produces erratic errors. There is probably a way to make it work as-is, but I don't see anything blocked between the instances when checking the VPC Flow Logs.
A workaround is to switch to TCP by modifying the JGroups stack directly in the image in file /opt/jboss/keycloak/standalone/configuration/standalone-ha.xml:
<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
    <channels default="ee">
        <channel name="ee" stack="tcp" cluster="ejb"/> <!-- set stack to "tcp" -->
    </channels>
    ...
</subsystem>
Then commit the new image:
docker commit -m="TCP cluster stack" CONTAINER_ID jboss/keycloak:6.0.1-tcp-cluster
Tag/push the image to Amazon ECR and make sure port 7600 is allowed in the security group between your Amazon ECS tasks.
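The push to ECR looks roughly like this (the account ID, region, and repository name below are placeholders; adjust to your setup):
# Placeholders: 123456789012 (account), us-east-1 (region), keycloak (repo)
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker tag jboss/keycloak:6.0.1-tcp-cluster 123456789012.dkr.ecr.us-east-1.amazonaws.com/keycloak:6.0.1-tcp-cluster
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/keycloak:6.0.1-tcp-cluster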

Apache Storm: Could not find leader nimbus from seed hosts

I installed Apache Storm 1.0 by following this tutorial, but I am not able to access the Storm UI from the Internet. Accessing localhost:8080 gives the following error:
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [localhost]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:90)
at org.apache.storm.ui.core$cluster_configuration.invoke(core.clj:343)
at org.apache.storm.ui.core$fn__12106.invoke(core.clj:929)
at org.apache.storm.shade.compojure.core$make_route$fn__2467.invoke(core.clj:93)
at org.apache.storm.shade.compojure.core$if_route$fn__2455.invoke(core.clj:39)
at org.apache.storm.shade.compojure.core$if_method$fn__2448.invoke(core.clj:24)
at org.apache.storm.shade.compojure.core$routing$fn__2473.invoke(core.clj:106)
at clojure.core$some.invoke(core.clj:2570)
at org.apache.storm.shade.compojure.core$routing.doInvoke(core.clj:106)
at clojure.lang.RestFn.applyTo(RestFn.java:139)
at clojure.core$apply.invoke(core.clj:632)
at org.apache.storm.shade.compojure.core$routes$fn__2477.invoke(core.clj:111)
at org.apache.storm.shade.ring.middleware.json$wrap_json_params$fn__11576.invoke(json.clj:56)
at org.apache.storm.shade.ring.middleware.multipart_params$wrap_multipart_params$fn__3543.invoke(multipart_params.clj:103)
at org.apache.storm.shade.ring.middleware.reload$wrap_reload$fn__4286.invoke(reload.clj:22)
at org.apache.storm.ui.helpers$requests_middleware$fn__3770.invoke(helpers.clj:46)
at org.apache.storm.ui.core$catch_errors$fn__12301.invoke(core.clj:1230)
at org.apache.storm.shade.ring.middleware.keyword_params$wrap_keyword_params$fn__3474.invoke(keyword_params.clj:27)
at org.apache.storm.shade.ring.middleware.nested_params$wrap_nested_params$fn__3514.invoke(nested_params.clj:65)
at org.apache.storm.shade.ring.middleware.params$wrap_params$fn__3445.invoke(params.clj:55)
at org.apache.storm.shade.ring.middleware.multipart_params$wrap_multipart_params$fn__3543.invoke(multipart_params.clj:103)
at org.apache.storm.shade.ring.middleware.flash$wrap_flash$fn__3729.invoke(flash.clj:14)
at org.apache.storm.shade.ring.middleware.session$wrap_session$fn__3717.invoke(session.clj:43)
at org.apache.storm.shade.ring.middleware.cookies$wrap_cookies$fn__3645.invoke(cookies.clj:160)
at org.apache.storm.shade.ring.util.servlet$make_service_method$fn__3351.invoke(servlet.clj:127)
at org.apache.storm.shade.ring.util.servlet$servlet$fn__3355.invoke(servlet.clj:136)
at org.apache.storm.shade.ring.util.servlet.proxy$javax.servlet.http.HttpServlet$ff19274a.service(Unknown Source)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:654)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1320)
at org.apache.storm.logging.filters.AccessLoggingFilter.handle(AccessLoggingFilter.java:47)
at org.apache.storm.logging.filters.AccessLoggingFilter.doFilter(AccessLoggingFilter.java:39)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1291)
at org.apache.storm.shade.org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:247)
at org.apache.storm.shade.org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:210)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1291)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:443)
at org.apache.storm.shade.org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1044)
at org.apache.storm.shade.org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:372)
at org.apache.storm.shade.org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:978)
at org.apache.storm.shade.org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.apache.storm.shade.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.apache.storm.shade.org.eclipse.jetty.server.Server.handle(Server.java:369)
at org.apache.storm.shade.org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:486)
at org.apache.storm.shade.org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:933)
at org.apache.storm.shade.org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:995)
at org.apache.storm.shade.org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.apache.storm.shade.org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.apache.storm.shade.org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.apache.storm.shade.org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
at org.apache.storm.shade.org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.apache.storm.shade.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.apache.storm.shade.org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Content of storm.yaml:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "localhost"
storm.local.dir: "/var/storm"
nimbus.host: "localhost"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
I resolved the two problems by myself.
For the first problem: I had to restart ZooKeeper after installing Apache Storm.
For the second problem: it was not a problem with Storm at all. The cause was the Azure platform: port 8080 is closed by default.
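For the record, restarting a tarball-installed ZooKeeper looks something like this (the install path below is an assumption; adjust it to your layout):
# Restart the standalone ZooKeeper that Storm points at
/opt/zookeeper/bin/zkServer.sh restart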
So, I thank myself for this effort.
If it were allowed, I would give you (myself) +1M points.
I had the same error and the answer I needed is not here, so here I am.
Before going further: I am using Storm version 2.1.0. In this version nimbus.host has been replaced by the array option nimbus.seeds.
The log says Did you specify a valid list of nimbus hosts for config nimbus.seeds?. Since nimbus.seeds is a mandatory option, I fixed the problem by simply adding the host IP address as the only element of this list:
nimbus.seeds: ["HOST IP"]

The Datastax cassandra community server 2.1.10 service on local computer started and then stopped

I am trying to configure a two-node cluster with Cassandra on Windows Server 2008 R2.
So I installed the Cassandra Community version on each server (10.xxx.0.1, 10.xxx.0.2).
Then I stopped the service and edited the cassandra.yaml file in the conf folder.
The changes are:
cluster_name
commented out num_tokens
set the tokens in initial_token
seeds as 10.xxx.0.1,10.xxx.0.2
listen_address on each node set to its own IP address (10.xxx.0.1 and 10.xxx.0.2 respectively)
rpc_address as 0.0.0.0
endpoint_snitch as GossipingPropertyFileSnitch
I also changed the cassandra-rackdc.properties file to dc=DC1, rack=RAC1.
I then saved, started the service back up, and opened cqlsh, but it does not connect. Below is the error:
2015-10-12 16:20:13 Commons Daemon procrun stderr initialized
If rpc_address is set to a wildcard address (0.0.0.0), then you must set broadcast_rpc_address to a value other than 0.0.0.0
Fatal configuration error; unable to start. See log for stacktrace.
..
ERROR 21:20:14 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: If rpc_address is set to a wildcard address (0.0.0.0), then you must set broadcast_rpc_address to a value other than 0.0.0.0
at org.apache.cassandra.config.DatabaseDescriptor.applyAddressConfig(DatabaseDescriptor.java:285) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:443) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:136) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) [apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562) [apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) [apache-cassandra-2.1.10.jar:2.1.10]
If you put 0.0.0.0 in rpc_address, you have to set broadcast_rpc_address as well, as described in http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html. I think the right broadcast_rpc_address is the node's own IP address.
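Concretely, the fix in cassandra.yaml on each node would be something like this (a sketch; substitute each node's own routable address):
rpc_address: 0.0.0.0
broadcast_rpc_address: 10.xxx.0.1    # this node's own IP, never 0.0.0.0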

Cassandra Channel has been closed

We have a small test cluster with 3 nodes on Amazon. Everything seems to work with cqlsh. But when I try to debug my app from my laptop (outside of Amazon, of course), I get 'Channel has been closed' errors, and it starts retrying forever. I know it's likely caused by the config in cassandra.yaml, as it shows some private IPs in my Eclipse console. I have tried many different ways but still get the same problem. I'd appreciate any input on this. How do I get rid of the private IPs 10.251.x.x on the client side?
Here is some context.
Versions:
[cqlsh 4.0.1 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
cassandra-driver-core-2.0.0-rc1.jar
In cassandra.yaml:
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "54.203.x.x,54.203.x.y"
listen_address: 10.251.a.b
broadcast_address: 54.203.x.x
native_transport_port: 9042
endpoint_snitch: Ec2MultiRegionSnitch
In Eclipse console:
DEBUG [main] (ControlConnection.java:145) - [Control connection] Successfully connected to /54.203.x.x
DEBUG [Cassandra Java Driver worker-0] (Session.java:379) - Adding /54.203.x.x to list of queried hosts
DEBUG [Cassandra Java Driver worker-1] (Session.java:379) - Adding /10.251.a.c to list of queried hosts
DEBUG [Cassandra Java Driver worker-1] (Connection.java:103) - [/10.251.a.c-1] Error connecting to /10.251.a.c (connection timed out: /10.251.a.c:9042)
DEBUG [Cassandra Java Driver worker-1] (Session.java:390) - Error creating pool to /10.251.a.c ([/10.251.a.c] Cannot connect)
DEBUG [Cassandra Java Driver worker-1] (Cluster.java:1064) - /10.251.a.c is down, scheduling connection retries
DEBUG [New I/O worker #4] (Connection.java:194) - Defuncting connection to /10.251.a.c
com.datastax.driver.core.TransportException: [/10.251.a.b] Channel has been closed
at com.datastax.driver.core.Connection$Dispatcher.channelClosed(Connection.java:548)
...
It seems that your Java driver is using auto-discovery by calling "describe cluster" to get a list of all the nodes in your cluster. In AWS, using Ec2Snitch, that yields private IPs, which obviously won't work from outside of AWS. There is a discussion on this topic here:
https://datastax-oss.atlassian.net/browse/JAVA-145
The last comment got my attention. It says you can do something with the driver's LoadBalancingPolicy to limit the nodes. Hopefully this includes specifying the exact IPs so it does not auto-discover.
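For illustration, here is a sketch of that approach using WhiteListPolicy, which was added to the Java driver shortly after the 2.0.0-rc1 jar mentioned above (so this assumes upgrading the driver); the addresses are the public seed placeholders from the question:
import java.net.InetSocketAddress;
import java.util.Arrays;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;

public class PublicIpOnlyClient {
    public static void main(String[] args) {
        // Restrict the driver to the public addresses; nodes discovered
        // through cluster metadata (the private 10.251.x.x IPs) are ignored.
        WhiteListPolicy policy = new WhiteListPolicy(
                new RoundRobinPolicy(),
                Arrays.asList(
                        new InetSocketAddress("54.203.x.x", 9042),
                        new InetSocketAddress("54.203.x.y", 9042)));

        Cluster cluster = Cluster.builder()
                .addContactPoint("54.203.x.x")
                .withLoadBalancingPolicy(policy)
                .build();
        Session session = cluster.connect();
        // ... run queries against session ...
        cluster.close();
    }
}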