“jmx_prometheus_javaagent” not working on Kafka cluster - apache-kafka

I am trying to add the Prometheus JMX agent (jmx_prometheus_javaagent-0.3.1.jar) to an existing Kafka cluster.
But when I run the Java agent, I get no response on the port:
curl http://localhost:8080
curl: (7) Failed connect to localhost:8080; Connection refused
Here is my configuration file "kafka.service":
[kafka@Kafka-dev prometheus]$ cat /etc/systemd/system/kafka.service
[Unit]
Description=Kafka
After=network.target
[Service]
User=kafka
Group=kafka
Environment="KAFKA_HEAP_OPTS=-Xmx256M -Xms128M"
Environment="KAFKA_OPTS=-javaagent:/home/kafka/prometheus/jmx_prometheus_javaagent-0.3.1.jar=8080:/home/kafka/prometheus/kafka-0-8-2.yml"
ExecStart=/home/kafka/kafka/bin/kafka-server-start.sh -daemon /home/kafka/kafka/config/server.properties
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
Then when I start kafka.service, it looks like it works:
sudo systemctl restart kafka
But when I check the status I find that the service is inactive:
[kafka@Kafka-dev ~]$ sudo systemctl status kafka.service
● kafka.service - Kafka
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2019-12-05 10:00:17 UTC; 1min 0s ago
Process: 125469 ExecStart=/home/kafka/kafka/bin/kafka-server-start.sh -daemon /home/kafka/kafka/config/server.properties (code=exited, status=0/SUCCESS)
Main PID: 125469 (code=exited, status=0/SUCCESS)
Note: firewalls on the machine are disabled.
I suspect this has something to do with the configuration of jmx_prometheus_javaagent.
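A few quick checks can narrow this down (a diagnostic sketch; the port and unit name are taken from the config above):
sudo ss -tlnp | grep 8080                      # is anything listening on the agent port at all?
ps aux | grep jmx_prometheus_javaagent         # did the Kafka JVM actually start with the -javaagent flag?
journalctl -u kafka.service -n 50 --no-pager   # what did the service log before going inactive?
One thing the status output already hints at: kafka-server-start.sh -daemon forks and exits immediately (hence code=exited, status=0/SUCCESS), and with systemd's default Type=simple the unit is considered finished as soon as that wrapper exits.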

Related

Cannot Launch Prometheus App for RPI, Error code 2

I am trying to run Prometheus' standalone app on an RPI4 8GB. I am following the instructions laid out here: https://pimylifeup.com/raspberry-pi-prometheus/
My prometheus.service file is this:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=pi
Restart=on-failure
ExecStart=/home/pi/prometheus/prometheus \
--config.file=/home/pi/prometheus/prometheus.yml \
--storage.tsdb.path=/home/pi/prometheus/data
[Install]
WantedBy=multi-user.target
But when I try to run the service I get the following error.
● prometheus.service - Prometheus Server
Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2022-11-24 18:42:51 GMT; 2s ago
Docs: https://prometheus.io/docs/introduction/overview/
Process: 485265 ExecStart=/home/pi/prometheus/prometheus --config.file=/home/pi/prometheus/prometheus.yml --storage.tsdb.path=/home/pi/prometheus/data (code=exited, status=2)
Main PID: 485265 (code=exited, status=2)
CPU: 160ms
Nov 24 18:42:51 master2 systemd[1]: prometheus.service: Scheduled restart job, restart counter is at 5.
Nov 24 18:42:51 master2 systemd[1]: Stopped Prometheus Server.
Nov 24 18:42:51 master2 systemd[1]: prometheus.service: Start request repeated too quickly.
Nov 24 18:42:51 master2 systemd[1]: prometheus.service: Failed with result 'exit-code'.
Nov 24 18:42:51 master2 systemd[1]: Failed to start Prometheus Server.
What does Error Status 2 mean in this context? Is it a permission problem, or something else?
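Exit status 2 here comes from the Prometheus binary itself rather than from systemd, and it is frequently a flag or configuration problem rather than a permission one. Running the exact ExecStart command in the foreground usually prints the real message (a diagnostic sketch using the paths from the unit above):
/home/pi/prometheus/prometheus \
  --config.file=/home/pi/prometheus/prometheus.yml \
  --storage.tsdb.path=/home/pi/prometheus/data
journalctl -u prometheus.service -n 50 --no-pager   # what was logged before the restart limit was hit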

Drain Node at Shutdown

I want to drain the node on shutdown and uncordon it on start. I wrote the unit file below, but I am getting an error (OpenShift 3.11 and Kubernetes 1.11.0).
[Unit]
Description=Drain Node at Shutdown
DefaultDependencies=no
Before=shutdown.target reboot.target halt.target
[Service]
Type=oneshot
ExecStart=/bin/sleep 60 && kubectl uncordon $HOSTNAME
ExecStop=kubectl drain $HOSTNAME --ignore-daemonsets --force --grace-period=30 && /bin/sleep 60
[Install]
WantedBy=halt.target reboot.target shutdown.target
It gives me this error:
error: no configuration has been provided
I set the environment variable, but still had no success:
[Service]
Environment="KUBECONFIG=$HOME/.kube/config"
The following systemd unit works; in ExecStop, the %H specifier should be used instead of $HOSTNAME:
[Unit]
Description=Drain Node at Shutdown
After=network.target glusterd.service
[Service]
Type=oneshot
Environment="KUBECONFIG=/root/.kube/config"
ExecStart=/bin/true
ExecStop=/usr/bin/kubectl drain %H --ignore-daemonsets --force --grace-period=30 --delete-local-data
TimeoutStopSec=200
# This service shall be considered active after start
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
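One subtlety with this pattern: ExecStop only runs at shutdown if the unit is still active, which is exactly what Type=oneshot plus RemainAfterExit=yes provides, and the unit has to be enabled and started once for that to be true. A usage sketch (the file name drain-node.service is an assumption):
sudo systemctl daemon-reload
sudo systemctl enable --now drain-node.service
systemctl is-active drain-node.service   # should report 'active' so ExecStop fires at shutdown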

kubelet service is not starting

Running commands such as kubectl get nodes results in the following error:
The connection to the server :6443 was refused - did you specify the right host or port?
I ran systemctl status kubelet.service and received the following state:
root@k8s-l2bridge-ma:~# sudo systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Tue 2020-06-16 11:46:05 UTC; 9s ago
Docs: https://kubernetes.io/docs/home/
Process: 28012 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 28012 (code=exited, status=255)
Jun 16 11:46:05 k8s-l2bridge-ma systemd[1]: kubelet.service: Failed with result 'exit-code'.
How can I troubleshoot the failure and find out what is wrong? I found a few leads while googling, but nothing solved the problem.
Just make this modification in the file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"
then execute the commands:
$ systemctl daemon-reload
$ systemctl restart kubelet
Take a look: fail-kubelet-service, kubelet-failed-start.
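Before editing flags, it can also help to see why kubelet exited with status 255 in the first place; the journal usually names the exact problem (a diagnostic sketch):
journalctl -u kubelet -n 30 --no-pager   # the last lines kubelet printed before dying
swapon --show                            # enabled swap is a classic cause, as the note below mentions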
In my case, disabling swap memory worked:
swapoff -a
To permanently disable Linux swap space, open the /etc/fstab file, search for the swap line, and add a # (hash) sign in front of it to comment out the entire line.
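Put together, the permanent version of this is roughly (a sketch; it assumes the fstab swap entry is an uncommented line containing the word swap):
sudo swapoff -a                                      # turn swap off for the running system
sudo sed -i '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab    # comment out the fstab swap line so it stays off after reboot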

Readiness probe failed: /usr/local/bin/status-probe.sh failed check: systemctl -q is-active glusterd.service

I've followed the setup guide using the gk-deploy script
./gk-deploy.sh --admin-key xxxx --user-key xxx -v -g
This went fine. However, the deployed containers fail with
Readiness probe failed: /usr/local/bin/status-probe.sh failed check: systemctl -q is-active glusterd.service
The glusterd.service on the node is running:
$ systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2019-10-19 18:48:25 CEST; 1 day 17h ago
Docs: man:glusterd(8)
Process: 939 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_
Main PID: 955 (glusterd)
Tasks: 10 (limit: 4915)
Memory: 30.5M
CGroup: /system.slice/glusterd.service
└─955 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
....
GLUSTER_BLOCKD_STATUS_PROBE_ENABLE is set to "0"
I am aware of this post, but it does not really help me solve my problem.
Let's try a simpler readiness probe to diagnose the problem. I would run the following script to make sure the service is indeed OK, given that this is the check that is failing:
if systemctl -q is-active glusterd.service; then
    exit 0
else
    exit 1
fi
There is no point in checking the logs since systemctl will give us the necessary exit code.
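If the simplified check passes but the probe still fails, running the same command inside the pod makes the comparison direct (a sketch; <pod> and <namespace> are placeholders for the failing glusterfs pod):
kubectl exec -n <namespace> <pod> -- systemctl -q is-active glusterd.service
echo $?   # 0 means active; anything else and the probe will keep failing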

How to start kubelet service?

I ran the command
systemctl stop kubelet
then tried to start it:
systemctl start kubelet
but it won't start.
Here is the output of systemctl status kubelet:
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Wed 2019-06-05 15:35:34 UTC; 7s ago
Docs: https://kubernetes.io/docs/home/
Process: 31697 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 31697 (code=exited, status=255)
Because of this, I am not able to run any kubectl command.
For example, kubectl get pods gives:
The connection to the server 172.31.6.149:6443 was refused - did you specify the right host or port?
This worked for me: disable swap with
swapoff -a
then start kubelet again:
systemctl start kubelet
So I needed to reset the kubelet service.
Here are the steps:
Check the status of your Docker service.
If it is stopped, start it with sudo systemctl start docker.
If it is not installed, install it:
#yum install -y kubelet kubeadm kubectl docker
Turn swap off: #swapoff -a
Now reset kubeadm: #kubeadm reset
Now try #kubeadm init
After that, check #systemctl status kubelet
It should be working.
Check the nodes:
kubectl get nodes
If the master node is not ready, refer to the following.
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If you are not able to create a pod, check DNS:
kubectl get pods --namespace=kube-system
If the DNS pods are in a pending state, you need to install a pod network add-on. I used Calico:
kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
Now your master node is ready and you can deploy pods.