I am trying to deploy a k8s cluster in openstack rocky but after long time it fails. I've checked orchestration stack and see that kube_minions resources never completes. Checking the logs output for all the instances created:
[ 196.817505] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 215.082433] random: crng init done
Fedora 27 (Atomic Host)
Kernel 4.14.18-300.fc27.x86_64 on an x86_64 (ttyS0)
host-10-0-0-3 login: [ 691.438618] bridge: filtering via
arp/ip/ip6tables is no longer available by default. Update your scripts
to load br_netfilter if you need this.
[ 691.516277] Bridge firewalling registered
[ 692.149217] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 701.932912] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Checking deeply in the instances I've found that in master node can not start heat-agent-service...
_prefix=docker.io/openstackmagnum/
atomic install --storage ostree --system --system-package no --set
REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt --name heat-container-
agent docker.io/openstackmagnum/heat-container-agent:rocky-stable
systemctl start heat-container-agent
Failed to start heat-container-agent.service: Unit heat-container-
agent.service not found.
2019-04-04 14:57:40,238 - util.py[WARNING]: Failed running
/var/lib/cloud/instance/scripts/part-013 [5]´
Related
I am trying to install kubernetes with kubeadm in my laptop which has Ubuntu 16.04. I have disabled swap, since kubelet does not work with swap on. The command I used is :
swapoff -a
I also commented out the reference to swap in /etc/fstab.
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=1d343a19-bd75-47a6-899d-7c8bc93e28ff / ext4 errors=remount-ro 0 1
# swap was on /dev/sda5 during installation
#UUID=d0200036-b211-4e6e-a194-ac2e51dfb27d none swap sw 0 0
I confirmed swap is turned off by running the following:
free -m
total used free shared buff/cache available
Mem: 15936 2108 9433 954 4394 12465
Swap: 0 0 0
When I start kubeadm, I get the following error:
kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.2
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
I also tried restarting my laptop, but I get the same error. What could the reason be?
below was the root cause.
detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd".
you need to update the docker cgroup driver.
follow the below fix
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Restart Docker
systemctl daemon-reload
systemctl restart docker
you could try kubeadm reset , then kubeadm init --ignore-preflight-errors Swap .
first try with sudo
sudo swapoff -a
then check if there's anything swapped
cat /proc/swaps
and
free -h
I have a service that, in some contexts, sends requests to itself.
I can reach the service from outside the cluster, but the self-requests fail (time-out).
Environment:
minikube v0.34.1
Linux version 4.15.0 (jenkins#jenkins) (gcc version 7.3.0 (Buildroot 2018.05)) #1 SMP Fri Feb 15 19:27:06 UTC 2019
I've been using https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#a-pod-cannot-reach-itself-via-service-ip as a troubleshooting guide, but I'm down the step that says "seek help".
Troubleshooting results:
journalctl -u kubelet | grep -i hairpin
Feb 26 19:57:10 minikube kubelet[3066]: W0226 19:57:10.124151 3066 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Feb 26 19:57:10 minikube kubelet[3066]: I0226 19:57:10.124295 3066 docker_service.go:236] Hairpin mode set to "hairpin-veth"
The troubleshooting guide indicates that "hairpin-veth" is OK.
for intf in /sys/devices/virtual/net/docker0/brif/veth*; do cat $intf/hairpin_mode; done
0
...
0
Note that the guide used /sys/devices/virtual/net/cbr0/brif/*, but in this version of minikube, the path is /sys/devices/virtual/net/docker0/brif/veth*. I'd like to understand why the paths are different, but it appears that hairpin_mode is not enabled.
The next step in the guide is: Seek help if none of above works out.
Am I correct in believing that I need to enable hairpin_mode?
If so, how do I do so?
It seems like known issue, more information here:
As workaround you can try:
minikube ssh -- sudo ip link set docker0 promisc on
Please share with the reulsts.
I try to stack up my kubeadm cluster with three masters. I receive this problem from my init command...
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
But I do not use no cgroupfs but systemd
And my kubelet complain for not knowing his nodename.
Jan 23 14:54:12 master01 kubelet[5620]: E0123 14:54:12.251885 5620 kubelet.go:2266] node "master01" not found
Jan 23 14:54:12 master01 kubelet[5620]: E0123 14:54:12.352932 5620 kubelet.go:2266] node "master01" not found
Jan 23 14:54:12 master01 kubelet[5620]: E0123 14:54:12.453895 5620 kubelet.go:2266] node "master01" not found
Please let me know where is the issue.
The issue can be because of docker version, as docker version < 18.6 is supported in latest kubernetes version i.e. v1.13.xx.
Actually I also got the same issue but it get resolved after downgrading the docker version from 18.9 to 18.6.
If the problem is not related to Docker it might be because the Kubelet service failed to establish connection to API server.
I would first of all check the status of Kubelet: systemctl status kubelet and consider restarting with systemctl restart kubelet.
If this doesn't help try re-installing kubeadm or running kubeadm init with other version (use the --kubernetes-version=X.Y.Z flag).
In my case,my k8s version is 1.21.1 and my docker version is 19.03. I solved this bug by upgrading docker to version 20.7.
What are installed for minikube:
$ ls -al /usr/local/bin/
-rwxr-xr-x 1 root root 26406912 Jun 14 12:05 docker-machine
-rwxrwxr-x 1 me libvirtd 11889064 Jun 14 12:07 docker-machine-driver-kvm
-rwxrwxr-x 1 me me 70232912 Jun 14 11:58 kubectl
-rwxrwxr-x 1 me me 82512696 Jun 14 11:57 minikube
Trying to start cluster by minikube
$ minikube start --vm-driver=kvm
Starting local Kubernetes v1.6.4 cluster...
Starting VM...
E0614 12:07:39.515994 14655 start.go:127] Error starting host: Error creating host: Error creating machine: Error in driver during machine creation: virError(Code=8, Domain=44, Message='invalid argument: could not find capabilities for domaintype=kvm ').
Retrying.
E0614 12:07:39.517076 14655 start.go:133] Error starting host: Error creating host: Error creating machine: Error in driver during machine creation: virError(Code=8, Domain=44, Message='invalid argument: could not find capabilities for domaintype=kvm ')
I am new to kubernetes. Any idea how to fix it? Thanks
UPDATE
sudo /usr/sbin/kvm-ok
INFO: /dev/kvm does not exist
HINT: sudo modprobe kvm_intel
INFO: Your CPU supports KVM extensions
INFO: KVM (vmx) is disabled by your BIOS
HINT: Enter your BIOS setup and enable Virtualization Technology (VT),
and then hard poweroff/poweron your system
KVM acceleration can NOT be used
$ dmesg | grep kvm
[ 2.114855] kvm: disabled by bios
[ 2.327746] kvm: disabled by bios
[ 120.423249] kvm: disabled by bios
[ 222.250977] kvm: disabled by bios
My update is close to the solution. The solution is to enable virtualization in the BIOS.
1, Power on your PC and open the BIOS.
2, Go to the security section and enable virtualization.
you need to install the kvm package refer package.
https://github.com/kubernetes/minikube/blob/master/docs/drivers.md#kvm-driver
# Install libvirt and qemu-kvm on your system, e.g.
# Debian/Ubuntu
$ sudo apt install libvirt-bin qemu-kvm
# Fedora/CentOS/RHEL
$ sudo yum install libvirt-daemon-kvm kvm
# Add yourself to the libvirtd group (use libvirt group for rpm based distros) so you don't need to sudo
# Debian/Ubuntu (NOTE: For Ubuntu 17.04 change the group to `libvirt`)
$ sudo usermod -a -G libvirtd $(whoami)
# Fedora/CentOS/RHEL
$ sudo usermod -a -G libvirt $(whoami)
# Update your current session for the group change to take effect
# Debian/Ubuntu (NOTE: For Ubuntu 17.04 change the group to `libvirt`)
$ newgrp libvirtd
# Fedora/CentOS/RHEL
$ newgrp libvirt
While trying to run a puppet update form a node:
sudo /opt/puppetlabs/bin/puppet agent -t
I get an error:
Error: Could not retrieve catalog; skipping run
Error: Could not send report: Connection refused - connect(2) for "puppet" port 8140`
Elsewhere indicates this is likely a problem with the puppetserver service, and suggests to reboot the server. Restarting didn't help, and when I try to restart the service I get failure:
~$ sudo service puppetserver restart
Job for puppetserver.service failed because the control process exited with error code. See "systemctl status puppetserver.service" and "journalctl -xe" for details.
I've looked at these logs, and as a puppet/linux noob, I'm not sure what to do next.
systemctl status puppetserver.service
● puppetserver.service - puppetserver Service
Loaded: loaded (/lib/systemd/system/puppetserver.service; enabled; vendor preset: enabled)
Active: activating (start-post) since Fri 2016-09-02 15:54:26 PDT; 2s ago
Process: 22301 ExecStartPre=/usr/bin/install --directory --owner=puppet --group=puppet --mode=775 /var/run/puppetlabs/puppetserver (code=exited
Main PID: 22306 (java); : 22307 (bash)
Tasks: 17
Memory: 335.7M
CPU: 5.535s
CGroup: /system.slice/puppetserver.service
├─22306 /usr/bin/java -Xms6g -Xmx6g -XX:MaxPermSize=256m -XX:OnOutOfMemoryError=kill -9 %p -Djava.security.egd=/dev/urandom -cp /opt/p
└─control
├─22307 /bin/bash /opt/puppetlabs/server/apps/puppetserver/ezbake-functions.sh wait_for_app
└─22331 sleep 1
Sep 02 15:54:26 puppet systemd[1]: Starting puppetserver Service...
Sep 02 15:54:26 puppet java[22306]: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
puppet version 4.6.1
The puppet master communicates with the other node using port number 8140.
I don't think a restart will help, since this looks like a connection issue between the server and the node.
please try the following -
first make sure that the puppet master is actually listening on port 8140. run the following command on the puppetmaster -
netstat -ntlp | grep 8140
this command should return something like this -
tcp 0 0 0.0.0.0:8140 0.0.0.0:* LISTEN 1783/puppetmaster
If you don't get the same output, your puppetmaster is not listening, and therefore can not compile catalogs for the node.
Try checking the puppet master log at /var/log/puppetmaster.log
check that the node can communicate with the puppetmaster on the relevant port. you can check this quickly with the telnet command. run this on your node -
telnet < puppetmaster ip address \ dns name> 8140
you should get something like -
Connected to <puppet-master-IP/DNS-name>
Escape character is '^]'.
if you don't get this output, this means that something is blocking you from accessing the puppetmaster. try opening the port in your firewall to access the puppetmaster.
if you're still stuck try using the --debug flag for verbose output and edit your question.
Could be 2 things: (1) in puppet.conf you have configured more memory than you have on your machine. Or (2) You installed both apt-get install puppetserver and apt-get install puppet.
If you get failed to start puppet.service: unit not found. error on slave machine while connecting to puppet.
Close the putty and then again open and connect it.The issue wont come while starting putty on slave.
The error occurs because there is not enough RAM and to fix the error, open the Puppet server configuration file:
sudo nano /etc/sysconfig/puppetserver
And reduce the amount of allocated RAM for the Puppet server (for example, I specified 512m instead of 2g):
JAVA_ARGS="-Xms512m -Xmx512m"
Now let’s start the Puppet server:
sudo systemctl start puppetserver