Kafka not starting from Confluent cli - apache-kafka

trying to start kafka
this is the command i am using
arch -x86_64 confluent local services start
I have just change the server properties file (host name changed)
below is the error (MAC) ..
#Using CONFLUENT_CURRENT: /var/folders/8j/n/T/confluent.200950
Starting ZooKeeper ZooKeeper is [UP]
Starting Kafka Error: Kafka failed to start
http://www.apache.org/licenses/LICENSE-2.0
#
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
#
This configuration file is intended for use in ZK-based mode, where Apache ZooKeeper is required.
#
Server Basics #############################
The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
Socket Server Settings #############################
The address the socket server listens on. If not configured, the host name will be equal to the value of
FORMAT:
listeners = listener_name://host_name:port
EXAMPLE:
listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://s-hq-macpro3331.local:9092
Listener name, hostname and port the broker will advertise to clients.
If not set, it uses the value for "listeners".
advertised.listeners=PLAINTEXT://s-hq-macpro3331.local:9092
Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3
The number of threads that the server uses for processing requests, which may include disk I/O
#num=8
The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

It appears you've removed leading # from the property file commentary, therefore Kafka will fail to start.
By default, it listens locally. You only need to modify listeners config (not hostname) if you are setting up a cluster (which the command doesn't do)
Plus, on a Mac, you can brew install kafka. There's nothing specific Confluent Platform is needed for, here. And it should work without arch emulation.
I'm sure why you previously tagged google-cloud-storage, but worth mentioning that that command is not recommended for production usage.
Within GCP, you could use GKE w/ Strimzi, or deploy Confluent Platform from the marketplace, or simply use Confluent Cloud and link to your account.

Related

Kafka Post Deployment - Handling ever-growing clients

We have setup a Kafka Cluster for High Availability and distributed data load. The current consumers and producers specify all the broker IP addresses to connect to the cluster. In the future, there will be the need to continuosly monitor the cluster and add a new broker based on collected metrics and overall system performance. In case a broker crashes, as soon as possible we have to add a new broker with a different IP.
In these scenarios, we have to change all client configurations, a time consuming and stressful operation.
I think we can setup a Config Server (e.g. Spring Cloud Config Server) to specify all the broker IP addresses in a centralized manner, so we have to change all in one place, without touching all the clients, but I don't know if it is the best approach. Obviously, the clients must be programmed to get broker list from config server.
There's a better approach?
Worth pointing out that the "bootstrap" process doesn't require giving every single broker address to the clients, really only the first available address in the list is used for the initial connection, then the advertised.listeners on all the broker configs in the cluster, is what the clients actually use
The answer to your question is to use service discovery, yes. That could be Spring Could Config, but the more general option would be Hashicorp Consul or other service that uses DNS (Kubernetes uses CoreDNS, by default, for example, or AWS Route53).
Then you edit the /etc/resolv.conf of each machine (assuming Linux) the client is running on to include the DNS servers, and you can simply refer to kafka.your.domain:9092 rather than using IP addresses
You could use a load balancer (with a friendly dns like kafka.domain.com), which points to all of your brokers. We do this in our environment. Your clients then connect to kafka.domain.com:9092.
As soon as you add new brokers, you only change the load balancer endpoints and not the client configuration.
Additionally please note that you only need to connect to one bootstrap broker and don't have to list all of them in the client configuration.

ArtemisMQ Connector

I'm new to ArtemisMQ and absolutely don't understand the sense of connectors.
Why is connector essential, as we already specify accepter of Broker Server in broker.xml -> we know which port (it is accepter port) to send a request to if we want to connect to this server. Even if this server is part of cluster, what is a role of connector? There is also information from other part of documentation about "Clusters", but there is words about cluster connections :
The cluster is formed by each node declaring cluster connections to other nodes in the core configuration file broker.xml. When a node forms a cluster connection to another node, internally it creates a core bridge (as described in Core Bridges) connection between it and the other node, this is done transparently behind the scenes - you don't have to declare an explicit bridge for each node. These cluster connections allow messages to flow between the nodes of the cluster to balance load.
From documentation "Understanding Connectors":
connectors are used by a client to define how it connects to a server.
What does it mean "define how"?
I've already read and another question about connector, but it doesn't help me.
Additional questions:
Is connector always the same as acceptor(I've downloaded some official examples and all of them(that i've seen) have both same acceptor and connector )?
What information does connector encapsulates, if it only consists of host+port (and it is same as acceptor's (if we omit that acceptor host can me 0.0.0. or localhost))?
Why does stand-alone Broker have connector, for example by default creation ./artemis create?
What should we write in connector?
Can you give a simple example when acceptor and connector are
different?
Two important points to note:
A connector is not essential depending on your use-case. You'll find that the default broker.xml doesn't have any connector elements defined. For example, if you just run ./artemis create the generated broker.xml will not have any connector elements.
The documentation you cited is quite old (from the very first release of Artemis). You may benefit from reading the latest documentation which has been updated for clarity in many places.
As noted in both the documentation and the other Stack Overflow answer you cited, certain components in the broker need to connect to other brokers (e.g. core bridges, cluster-connections, etc.). A connector encapsulates the information necessary for these other components to make the connections they need. It's really as simple as that.
Now regarding your individual questions...
Even if this server is part of cluster, what is a role of connector?
In the case of a cluster using a broadcast-group and a discovery-group each node in the cluster needs to broadcast to all the other nodes in the cluster how the other nodes can connect to itself. It does this by broadcasting a connector which is referenced in the cluster-connection configuration. When the other nodes in the cluster receive this broadcast they take the connector information and use it to connect back to the node which broadcast it originally. In this way nodes can dynamically discover and connect to each other. It's also worth noting that in this case the connector configuration will essentially mirror one of the broker's acceptor configurations (since the connector will be used by other nodes to connect to the broadcasting node's acceptor). This is discussed further in the cluster documentation.
...connectors are used by a client to define how it connects to a server...
This bit of documentation you quoted is accurate but may be a bit confusing. Keep in mind that that a client can run anywhere, even within the broker itself. In the case of core bridges and cluster connections there is a client running in the broker which use the connector to determine how to connect to another broker. For what it's worth the updated documentation doesn't have this specific wording.
What does it mean "define how"?
A connector is the URL that the client needs to connect to the broker. The URL can simply include the host and port or it can contain lots of configuration details for the connection (e.g. SSL config).
Is connector always the same as acceptor..?
No, not always. In the case of a cluster they will be the same (or very close) for the reasons I already outlined, but in the case of a bridge they won't be the same.
What information does connector encapsulates..?
See above.
Why does stand-alone Broker have connector, for example by default creation ./artemis create?
It doesn't. See above.
What should we write in connector?
The URL needed to connect.
Can you give a simple example when acceptor and connector are different?
As mentioned previously, bridging is an example where different acceptors and connectors are used. ActiveMQ Artemis ships with a "core-bridge" example in the examples/features/standard directory which demonstrates different acceptors and connectors. The example involves 2 different brokers with one broker having a core bridge configured to send messages to the other broker. Here's the broker.xml with the bridge defined. You can see the acceptor listening on the localhost:61616 and the connector for localhost:61617. This connector points to the other broker which is listening on localhost:61617.

How to see what kafka client application connected from a certain ip address?

Although kafka clients are authenticated, and can be restricted (authorized) to connect only from allowed ip addresses, on these app servers multiple applications may be deployed, and it would be beneficial if kafka admin could somehow match certain connection (visible only via netstat on kafka server machine!) to a certain application, either by an application tag passed explicitely by kafka client, or by command name that started the kafka client application passed by the local client operating system to a kafka client (and visible via ps command on unix), and passed through kafka client to kafka broker. Is there already such a possibility?
That would imply that connections are held as browsable objects within kafka, somewhere, either in some internal topic, or in its zookeeper.
Alternatively, at least displaying the authorized principal that initiated connection would also do. That is a question for both consumers and producers.

Kafka, configuring a single vs multiple broker servers for clients?

I use to configure bootstrap.servers in my kafka producer/consumer/stream apps with a list of broker ips. But I’d like to move to a single url entry that will be resolved by the DNS lookup to a broker ip currently known as up (DNS actively check the brokers in the cluster and responds to lookup with an IP short TTL [10s]). This gives me more flexibility to add brokers in the future, and I can keep the same config in my apps across all the environments/stages. Is this a recommended approach, or this remove resiliency on the client side to not have a strict list of brokers? I assume this config would only be used to initially “discover” the cluster and the partition leader brokers.
If anything, I'd say this adds a single point of failure on the single address you're providing, unless it's actually a load balanced, reverse proxy.
Another possibility that's worked somewhat well internally is using Consul service discovery, with Consul agents running on each broker. This way, you can do service discovery as well as health checks and easier monitoring setup, e.g. having Prometheus jmx_exporter on the brokers, and Prometheus Server scraping those values for all kafka.service.consul addresses

How to run two instances of schema registry

I am trying to run Kafka in cluster mode using two instances of schema registry but I am not quite sure how to configure the second instance so that it takes over in case the first one is down.
Here's the properties file for the first schema-registry instance:
port=8081
# The address the socket server listens on.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=http://0.0.0.0:8081
# Zookeeper connection string for the Zookeeper cluster used by your Kafka cluster
# (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
kafkastore.connection.url=localhost:2181,localhost:2182,localhost:2183
# Alternatively, Schema Registry can now operate without Zookeeper, handling all coordination via
# Kafka brokers. Use this setting to specify the bootstrap servers for your Kafka cluster and it
# will be used both for selecting the master schema registry instance and for storing the data for
# registered schemas.
# (Note that you cannot mix the two modes; use this mode only on new deployments or by shutting down
# all instances, switching to the new configuration, and then starting the schema registry
# instances again.)
#kafkastore.bootstrap.servers=localhost:9092
# The name of the topic to store schemas in
kafkastore.topic=_schemas
# If true, API requests that fail will include extra debugging information, including stack traces
debug=false
How should the second file look like so that they can both communicate with zookeeper and achieve high availability?
You can use either the Kafka leader election or Zookeeper leader election.
The only thing you need to change between two instances on the same machine connected to the same Kafka/Zookeeper is the port and listeners property
To appropriately configure high availability, you need a HTTP load balancer for giving one address for all instances.