Ceph status do not respond when ntp is running - centos

Command ceph status doesn't response if we put any servers in /etc/ntp.conf
I have 3 ceph nodes on centos 7 with this /etc/ntp.conf:
driftfile /vat/lib/ntp/drift
restrict 0.0.0.0 mask 0.0.0.0
server 0.ua.pool.ntp.org iburst
server 1.ua.pool.ntp.org iburst
server 2.ua.pool.ntp.org iburst
server 3.ua.pool.ntp.org iburst
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
and with this /etc/rc.local:
touch /var/lock/subsys/local
/sbin/iptables-restore < /etc/sysconfig/iptables
/sbin/ntpd -gq
/sbin/hwclock --systohc
systemctl enable ntpd.service
systemctl start ntpd.service
If I comment servers in /etc/ntp.conf:
#server 0.ua.pool.ntp.org iburst
#server 1.ua.pool.ntp.org iburst
#server 2.ua.pool.ntp.org iburst
#server 3.ua.pool.ntp.org iburst
then ceph becomes response. But with this answer:
health HEALTH_WARN
clock skew detected on mon.node2, mon.node3
Monitor clock skew detected
...
systemctl status ntpd.service show that service is active and running.
I really cant understand why ceph becomes unresponsable if we put servers in ntp.conf.
Please, help me.

I don't know why but it have happened because of command in ntpd -gq. This command updates your datatime from servers which you have written in ntp.conf and then stops.
I can't figure out why ceph fails after this command but when I changed it to:
ntpdate 0.ua.pool.ntp.org
Ceph became work.

Related

Kubernetes Service pinging not working time to time "Temporary fail in name resolution"

I have two separate clusters (Application and DB) in the same namespace. Statefulset for DB cluster and Deployment for Application cluster. For internal communication I have configured a Headless Service. When I ping from a pod in application cluster to the service it works (Works the other way round too - DB pod to service works). But sometimes, for example if I continuously execute ping command for like 3 times, the third time it gives an error - "ping: : Temporary failure in name resolution". Why is this happening?
As far as I know this is usually a name resolution error and shows that your DNS server cannot resolve the domain names into their respective IP addresses. This can present a grave challenge as you will not be able to update, upgrade, or even install any software packages on your Linux system. Here I have listed few reasons
1.Forgot configuring or Wrongly Configured resolv.conf File
The /etc/resolv.conf file is the resolver configuration file in Linux systems. It contains the DNS entries that help your Linux system to resolve domain names into IP addresses.
If this file is not present or is there but you are still having the name resolution error, create one and append the Google public DNS server as nameserver 8.8.8.8
Save the changes and restart the systemd-resolved service as shown.
$ sudo systemctl restart systemd-resolved.service
It’s also prudent to check the status of the resolver and ensure that it is active and running as expected:
$ sudo systemctl status systemd-resolved.service
2. Due to Firewall Restrictions
By some chance if the first solution did not work for you, firewall restrictions could be preventing you from successfully performing DNS queries. Check your firewall and confirm if port 53 (used for DNS – Domain Name Resolution ) and port 43 are open. If the ports are blocked, open them as follows:
For UFW firewall (Ubuntu / Debian and Mint)
To open ports 53 & 43 on the UFW firewall run the commands below:
$ sudo ufw allow 43/tcp
$ sudo ufw reload```
For firewalld (RHEL / CentOS / Fedora)
For Redhat based systems such as CentOS, invoke the commands below:
```$ sudo firewall-cmd --add-port=53/tcp --permanent
$ sudo firewall-cmd --add-port=43/tcp --permanent
$ sudo firewall-cmd --reload
I hope that you now have an idea about the ‘temporary failure in name resolution‘ error. I also found a similar git issue hope that helps
https://github.com/kubernetes/kubernetes/issues/6667

kubectl : Unable to connect to the server : dial tcp 192.168.214.136:6443: connect: no route to host

I recently installed kubernetes on VMware and also configured few pods , while configuring those pods , it automatically used IP of the VMware and configured. I was able to access the application during that time but then recently i rebooted VM and machine which hosts the VM, during this - IP of the VM got changed i guess and now - I am getting below error when using command kubectl get pod -n <namspaceName>:
userX#ubuntu:~$ kubectl get pod -n NameSpaceX
Unable to connect to the server: dial tcp 192.168.214.136:6443: connect: no route to host
userX#ubuntu:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 192.168.214.136:6443: connect: no route to host
kubectl cluster-info as well as other related commands gives same output.
in VMware workstation settings, we are using network adapter which is sharing host IP address setting. We are not sure if it has any impact.
We also tried to add below entry in /etc/hosts , it is not working.
127.0.0.1 localhost \n
192.168.214.136 localhost \n
127.0.1.1 ubuntu
I expect to run the pods back again to access the application.Instead of reinstalling all pods which is time consuming - we are looking for quick workaround so that pods will get back to running state.
If you use minikube sometimes all you need is just to restart minikube.
Run:
minikube start
I encountered the same issue - the problem was that the master node didn't expose port 6443 outside.
Below are the steps I took to fix it.
1 ) Check IP of api-server.
This can be verified via the .kube/config file (under server field) or with: kubectl describe pod/kube-apiserver-<master-node-name> -n kube-system.
2 ) Run curl https://<kube-apiserver-IP>:6443 and see if port 6443 is open.
3 ) If port 6443 you should get something related to the certificate like:
curl: (60) SSL certificate problem: unable to get local issuer certificate
4 ) If port 6443 is not open:
4.A ) SSH into master node.
4.B ) Run sudo firewall-cmd --add-port=6443/tcp --permanent (I'm assuming firewalld is installed).
4.C ) Run sudo firewall-cmd --reload.
4.D ) Run sudo firewall-cmd --list-all and you should see port 6443 is updated:
public
target: default
icmp-block-inversion: no
interfaces:
sources:
services: dhcpv6-client ssh
ports: 6443/tcp <---- Here
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
The common practice is to copy config file to the home directory
sudo cp /etc/kubernetes/admin.conf ~/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
Also, make sure that api-server address is valid.
server: https://<master-node-ip>:6443
If not, you can manually edit it using any text editor.
You need to export the admin.conf file as kubeconfig before running the kubectl commands. You may put this as your env variable
export kubeconfig=<path>/admin.conf
after this you should be able to run the kubectl command. I am hoping that your setup of K8S cluster is proper.
Last night I had the exact same error installing Kubernetes using this puppet module: https://forge.puppet.com/puppetlabs/kubernetes
Turns out that it is an incorrect iptables setting in the master that blocks all non-local requests towards the api.
The way I solved it (bruteforce solution) is by
completely remove alle installed k8s related software (also all config files, etcd data, docker images, mounted tmpfs filesystems, ...)
wipe the iptables completely https://serverfault.com/questions/200635/best-way-to-clear-all-iptables-rules
reinstall
This is what solved the problem in my case.
There is probably a much nicer and cleaner way to do this (i.e. simply change the iptables rules to allow access).
if you getting the below error then you also check once the token validity.
Unable to connect to the server: dial tcp 192.168.93.10:6443: connect: no route to host
Check your token validity by using the command kubeadm token list if your token is expired then you have to reset the cluster using kubeadm reset and than initialize again using command kubeadm init --token-ttl 0.
Then again check the status of the token using kubeadm token list. Note here the TTL value will be <forever> and Expire value will be <never>.
example:-
[root#master1 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
nh48tb.d79ysdsaj8bchms9 <forever> <never> authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
Ubuntu 22.04 LTS Screenshot
Select docker-desktop and run again your command, e.g kubectl apply -f <myimage.yaml>
Run minikube start command
The reason behind that is your minikube cluster with driver docker stopped
when you shutdown the system
To all those who are trying to learn and experiment kubernetes using Ubuntu on Oracle VM:
IP address is assigned to Guest OS/VM based on the network adapter selection. Based on your network adapter selection, you need to configure the settings in Oracle VM network section or your router settings.
See the link for most common Oracle VM network adapter.
https://www.nakivo.com/blog/virtualbox-network-setting-guide/
I was using bridge adapter which put VM and host OS in parallel. So the my router was randomly assigning IP to my VM after every restart and my cluster stopped working and getting the same exact error message posted in the question.
> k get pods -A
> Unable to connect to the server: dial tcp 192.168.214.136:6443: connect: no route to host
> systemctl status kubelet
> ........
> ........ "Error getting node" err="node \"node\" not found"
Cluster started working again after reserving static IP address to my VM in router settings.(if you are using NAT adapter, you should configure it in VM network settings)
When you are reserving IP address to your VM, make sure to assign the same old IP address which was used for configuring kubelet.

How to completely purge minikube config or reset IP back to 192.168.99.100

I want to completely purge Minikube so that I could start over as if I installed it for the first time to avoid some configuration clashes. Mailnly to have initial IP 192.168.99.100, unfortunatelly it increases on next minikube start to 192.168.99.101, etc. I've run to delete Minikube:
minikube delete
rm -rf ~/.minikube
rm -rf ~/.kube
I'm running minikube version: v0.31.0 on Ubuntu 18.04 with driver VirtualBox 5.2.18
I found this problem on the Mac with VirtualBox as well. I tried removing the Host Network Manager, but it did not work for me. However, I did find another solution.
After issuing minikube delete, I removed the following file:
/Users/{username}/Library/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.leases
Starting minikube again reset the address to .100.
File contents:
<?xml version="1.0"?>
<Leases version="1.0">
<Lease mac="08:00:27:66:6a:19" id="01080027666a19" network="0.0.0.0" state="expired">
<Address value="192.168.99.102"/>
<Time issued="1555277299" expiration="1200"/>
</Lease>
<Lease mac="08:00:27:08:03:a3" id="010800270803a3" network="0.0.0.0" state="expired">
<Address value="192.168.99.101"/>
<Time issued="1555276993" expiration="1200"/>
</Lease>
<Lease mac="08:00:27:32:ed:f8" id="0108002732edf8" network="0.0.0.0" state="expired">
<Address value="192.168.99.100"/>
<Time issued="1555276537" expiration="1200"/>
</Lease>
</Leases>
I've recently run into this issue on mpb; tracking down issues with minikube helm and tiller on VirtualBox v6.0.10
Cleanest solution I've found works as expected
#!/bin/sh
function minikube_reset_vbox_dhcp_leases() {
# # Reset Virtualbox DHCP Lease Info
echo "Resetting Virtualbox DHCP Lease Info..."
kill -9 $(ps aux |grep -i "vboxsvc\|vboxnetdhcp" | awk '{print $2}') 2>/dev/null
if [[ -f ~/Library/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.leases ]] ; then
rm ~/Library/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.leases
fi
}
minikube_reset_vbox_dhcp_leases
Credit: issues/951
Minikube is used on different platforms, so it might be helpful to add information related to most popular of them.
Minikube is not responsible for assigning IP address to its VM.
If you are starting minikube on Windows or MacOS new VM is created. That VM gets the first available IP address from the pool of hypervisor DHCP service. In brief, DHCP service reserves this IP for the VM for some period of time, usually from 24 hours to 7 days. If in this period the client doesn't refresh DHCP lease and this IP is not available on the network, the IP is considered free and can be offered to another client.
VirtualBox has only basic settings for its DHCP service, you are not allowed to configure lease time or static ip binding. So, you may try to change ip configuration of minikube VM network interface, after VM is created using minikube ssh. Or you can play with VM MAC address right after creation because DHCP offers IP address based on host MAC address.
HyperV uses existed DHCP on local network for shared networks or manually configured DHCP server for internal networks. If you have access to DHCP administration console you can delete the old minikube VM IP binding before starting a new VM using minikube start.
For Linux you can choose two options, you can use virtualbox hypervisor and create VM like it works on Windows or MAC, so DHCP will work like I've mentioned previously, or you can use -vm-driver=none argument and setup Kubernetes cluster inside host environment without VM. In this case your host machine becomes a Kubernetes master node with the same IP configuration.
If you use virtualbox as the vm-driver, you can use this Python script on Linux/Mac to reset the ip to 192.168.99.100:
./minikube_reset
#!/usr/bin/env python3
import subprocess as sp
from sys import platform
import os
if __name__ == "__main__":
print("Resetting Virtualbox DHCP...")
procs = sp.run("ps aux", shell=True, stdout=sp.PIPE)\
.stdout.decode("utf8").lower().split('\n')
pids = [
p.split()[1] for p in procs if 'vboxsvc' in p or 'vboxnetdhcp' in p
]
for pid in pids:
sp.run(['kill', '-9', pid])
cfg_dir = ".config" if platform != 'darwin' else 'Library'
file = f"~/{cfg_dir}/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.leases"
try:
os.remove(os.path.expanduser(file))
except OSError as e:
pass
If you make the script an executable via chmod +x minikube_reset and put it into your path, you can run:
minikube stop # Stop your running minikube instance.
minikube_reset # Reset the ip.
minikube start # Start new minikube instance with 192.168.99.100.
Your minikube instance should always start with 192.168.99.100 after minikube_reset.
For Ubuntu 18.0.4 you can try
rm -r /home/username/.config/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.leases

Minikube on Windows with VirtualBox: Connection attempt fail

I got Kubernetes Minikube on my laptop (4cores, 8 GB RAM). I just performed the basic installation steps (got miniKube and kubectl, enabled the BIOS virtualization) and I am able to start the cluster:
C:\Users\me>minikube start
Starting local Kubernetes cluster...
Starting VM...
SSH-ing files into VM...
Setting up certs...
Starting cluster components...
Connecting to cluster...
Setting up kubeconfig...
Kubectl is now configured to use the cluster.
However, when I try to interact with the cluster, I allways get the same error, sample:
C:\Users\me>kubectl get pods --context=minikube
Unable to connect to the server: dial tcp 192.168.99.100:8443: connectex: A connection attempt failed because the connected party
did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
I execute minikube ip and I ping the result IP and I get a response. Also I tried to give more memory (3Gb vs the standard 2Gb) and nothing changed.
Am I doing something wrong here?
Thanks!
I had same issue as above. I found out that kubectl couldn't connect to the cluster and would throw up the error when i'm on a VPN connection. When I turned off my VPN client, it started working as fine.
I think it could be some problem with the cluster, when I run minikube status I've got the mixed results of cluster running and cluster stopped:
First run:
c:\> minikube status
minikube: Running
cluster: Stopped
kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.100
Second run:
minikube: Running
cluster: Running
kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.100
Third run:
minikube: Running
cluster: Stopped
kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.100
The service is flapping.
UPDATED:
Connecting to the minikube vm using minikube ssh I realized the kubeconfig file have wrong path separator for certificates generated by minikube automatic configuration. The path on kubeconfig file stands for \var\lib\localkube\certs\ca.cert and it have to be /var/lib/localkube/certs/ca.cert and so on...
To update the file I have to copy the content of the orignal file to my desktop, fix the directory separators and save the correct file to /var/lib/localkube/kubeconfig and restart the service using:
sudo systemclt restart localkube.
I hope everyone can use minikube with this tip.
If it keep to hit 8443 connection issue when changed work environment, would simplify turn off TLS verification for minikube local cluster if there is not clue.
https://github.com/robertluwang/docker-hands-on-guide/blob/master/minikube-no-tls-verify.md
Hope it is helpful for you.
BR/
Robert
from the documentation:
for Troubleshooting
Run minikube start --alsologtostderr -v=7 to debug crashes
I had the same problem:
check if a some service of a VPN is running by checking the task management, for me, I had a running service of my VPN, so kill the task and try to run the command showed above

Kubernetes configuration step 2 CentOS 7

From http://kubernetes.io/docs/getting-started-guides/kubeadm/
CentOS Linux release 7.2.1511 (Core)
(1/4) Installing kubelet and kubeadm on your hosts
.....
it's ok
$sudo docker -v
Docker version 1.10.3, build cb079f6-unsupported
$sudo kubeadm version
$kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.0.1534+cf7301f16c0363-dirty", GitCommit:"cf7301f16c036363c4fdcb5d4d0c867720214598", GitTreeState:"dirty", BuildDate:"2016-09-27T18:10:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
$sudo systemctl enable docker && systemctl start docker
$sudo systemctl enable kubelet && systemctl start kubelet
it's ok again
$ sudo kubeadm init
<master/tokens> generated token: "15a340.9910f948879b5d99"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
And at that place proccess stopped.
Probably, i can'nt understand something, but RedHat OpenShift version 3 use kubernetes+docker. I tried OpenShift v3 docker version download - it was ok.
I fixed that issue with a likewise setup by declaring the private ip address as localhost in the /etc/hosts file.
Example: /etc/hosts
10.0.0.2 localhost
Then I run a problem where kubectl get nodes threw:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This I fixed by copying the generated conf to the local kube config.
cp /etc/kubernetes/kubelet.conf ~/.kube/config
There are a couple of possibilities here -:
1) In older kubeadm versions selinux blocks access at this point
2) If you are behind a proxy you will need to add the usual to the kubeadm environment -:
HTTP_PROXY
HTTPS_PROXY
NO_PROXY
Plus, which I have not seen documented anywhere -:
KUBERNETES_HTTP_PROXY
KUBERNETES_HTTPS_PROXY
KUBERNETES_NO_PROXY
.....
<master/apiclient> all control plane components are healthy after 20.585964 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 8.259447 seconds
<master/apiclient> attempting a test deployment
<master/apiclient> test deployment succeeded
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 66.415198 seconds
kubeadm: I am an alpha version, my authors welcome your feedback and bug reports
kubeadm: please create an issue using https://github.com/kubernetes/kubernetes/issues/new
kubeadm: and make sure to mention #kubernetes/sig-cluster-lifecycle. Thank you!
failed creating essential kube-proxy addon [Timeout: request did not complete within allowed duration]
Fixed. But I installed and configuregd succefully 1.2.0 version... Oh