Why does CoreDNS not resolve any domain name when firewalld is enabled? It works when firewalld is stopped - kubernetes

I have enabled all required ports. When I enable the firewalld service, CoreDNS doesn't resolve any domain name with the command $ kubectl exec -ti busybox -- nslookup kubernetes.default

This seems to be a known issue, which you can find on GitHub: Fresh deploy with CoreDNS not resolving any dns lookup #1056.
There seem to be a few solutions, each pointing to a different underlying problem.
One being:
sudo systemctl stop firewalld
Please remember this is not recommended.
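If you want to keep firewalld running, a commonly suggested alternative is to allow masquerading and DNS traffic instead of disabling the firewall. This is only a sketch, assuming firewalld's default zone on every node; the full set of ports your cluster needs may differ:
$ sudo firewall-cmd --add-masquerade --permanent
$ sudo firewall-cmd --add-port=53/tcp --permanent
$ sudo firewall-cmd --add-port=53/udp --permanent
$ sudo firewall-cmd --reload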
Another solution might be:
Setting the default policy of the iptables FORWARD chain to ACCEPT with iptables -P FORWARD ACCEPT.
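A minimal way to apply and check this on a node (assuming iptables is the active firewall backend; the change does not survive a reboot unless you persist it, e.g. with iptables-save):
$ sudo iptables -P FORWARD ACCEPT        # set the default FORWARD policy to ACCEPT
$ sudo iptables -L FORWARD | head -n 1   # should print: Chain FORWARD (policy ACCEPT)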
Also check whether the CoreDNS deployment has enough resources, as this might be causing restarts.
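For example, you can inspect the CoreDNS pods and their resource limits like this (the k8s-app=kube-dns label and the coredns deployment name are the kubeadm defaults; adjust them if your cluster uses different names):
$ kubectl -n kube-system get pods -l k8s-app=kube-dns
$ kubectl -n kube-system describe deployment coredns | grep -iA2 limits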
You need to provide more details regarding your cluster so we can pinpoint the issue.

This problem may originate from packet forwarding between interfaces being disabled. There are two options:
First, to enable it only for the current session (I also recommend this for testing):
$ vim /proc/sys/net/ipv4/ip_forward
# set to 1
For a more permanent solution:
$ vim /etc/sysctl.conf
# ADD net.ipv4.ip_forward=1
$ sudo /sbin/sysctl -p
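Equivalently, without opening an editor (these apply the same setting as above):
$ sudo sysctl -w net.ipv4.ip_forward=1                            # takes effect immediately, lost on reboot
$ echo 'net.ipv4.ip_forward=1' | sudo tee -a /etc/sysctl.conf     # persist across reboots
$ sudo sysctl -p                                                  # reload and verify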

Related

Kubernetes Service pinging not working from time to time: "Temporary failure in name resolution"

I have two separate clusters (Application and DB) in the same namespace: a StatefulSet for the DB cluster and a Deployment for the Application cluster. For internal communication I have configured a Headless Service. When I ping from a pod in the application cluster to the service, it works (it works the other way round too: DB pod to service). But sometimes, for example if I execute the ping command continuously, say 3 times, the third time it gives an error: "ping: : Temporary failure in name resolution". Why is this happening?
As far as I know this is usually a name resolution error, which shows that your DNS server cannot resolve the domain names into their respective IP addresses. This can present a grave challenge, as you will not be able to update, upgrade, or even install any software packages on your Linux system. Here I have listed a few reasons:
1. Missing or wrongly configured resolv.conf file
The /etc/resolv.conf file is the resolver configuration file in Linux systems. It contains the DNS entries that help your Linux system to resolve domain names into IP addresses.
If this file is not present, or it is there but you are still having the name resolution error, create one and append the Google public DNS server as nameserver 8.8.8.8.
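For example, a minimal /etc/resolv.conf could look like this (8.8.8.8 is Google's public resolver; substitute your own DNS server if you have one):
# /etc/resolv.conf
nameserver 8.8.8.8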
Save the changes and restart the systemd-resolved service as shown.
$ sudo systemctl restart systemd-resolved.service
It’s also prudent to check the status of the resolver and ensure that it is active and running as expected:
$ sudo systemctl status systemd-resolved.service
2. Due to Firewall Restrictions
By some chance, if the first solution did not work for you, firewall restrictions could be preventing you from successfully performing DNS queries. Check your firewall and confirm that port 53 (used for DNS - Domain Name Resolution) is open for both TCP and UDP. If the port is blocked, open it as follows:
For UFW firewall (Ubuntu / Debian and Mint)
To open port 53 on the UFW firewall, run the commands below:
$ sudo ufw allow 53/tcp
$ sudo ufw allow 53/udp
$ sudo ufw reload
For firewalld (RHEL / CentOS / Fedora)
For Red Hat based systems such as CentOS, invoke the commands below:
$ sudo firewall-cmd --add-port=53/tcp --permanent
$ sudo firewall-cmd --add-port=53/udp --permanent
$ sudo firewall-cmd --reload
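After reloading, you can confirm the rule is in place and that resolution works again (dig comes from the dnsutils package on Debian/Ubuntu or bind-utils on RHEL/CentOS):
$ sudo firewall-cmd --list-ports    # should include 53/tcp 53/udp
$ dig +short kubernetes.io          # should print one or more IP addresses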
I hope that you now have an idea about the 'temporary failure in name resolution' error. I also found a similar GitHub issue, hope that helps:
https://github.com/kubernetes/kubernetes/issues/6667

After rebooting a CentOS 7 server, running kubectl get pod gives the error: the connection to the server localhost:8080 was refused

What happened:
when I reboot the CentOS 7 server and run kubectl get pod, I see the error below:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
What you expected to happen:
before I rebooted the system, the Kubernetes cluster had three nodes, and pods/services etc. were all working fine.
How to reproduce it (as minimally and precisely as possible):
reboot the server
kubectl get pod
Anything else we need to know?
I even used sudo kubeadm reset and init again but the issue still exists!
There are a few things to consider:
kubeadm reset performs a best effort revert of changes made by kubeadm init or kubeadm join. So some configurations may stay on the cluster.
Make sure you run kubectl as the proper user. You might need to copy admin.conf to the .kube/config path in that user's home directory.
After kubeadm init you need to run the following commands:
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
Make sure you do so.
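Alternatively, the standard per-user setup from the kubeadm documentation copies the file into ~/.kube/config instead of exporting KUBECONFIG:
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config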
Check CentOS's firewall configuration. After the restart it might have gone back to defaults.
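A quick way to check, assuming firewalld is the firewall in use:
$ sudo firewall-cmd --state       # running / not running
$ sudo firewall-cmd --list-all    # active zone, ports and services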
Please let me know if that helped.

Nginx ingress controller on Kubernetes not allowing installation of some packages

I am looking to execute
apt install tcpdump
but I am facing permission denial. When I try to switch to the root user, it asks me for a password and I don't know where to get that password from.
I installed nginx helm chart from stable/nginx repository with no RBAC
Please see the snapshot for details of the error I got while trying to install tcpdump in the pod after ssh'ing into it.
In Using GDB with Nginx, you can find a troubleshooting section.
In short:
find the node where your pod is running (kubectl get pods -o wide)
ssh into the node
find the docker_ID for this image (docker ps | grep pod_name)
run docker exec -it --user=0 --privileged docker_ID bash
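Putting the steps together (assuming a Docker-based node and a Debian/Ubuntu-based image inside the pod; values in angle brackets are placeholders):
$ kubectl get pods -o wide                          # note the NODE column for your pod
$ ssh <node>
$ docker ps | grep <pod_name>                       # note the container ID
$ docker exec -it --user=0 --privileged <container_id> bash
# apt-get update && apt-get install -y tcpdump      # now runs as root inside the container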
Note: Runtime privilege and Linux capabilities
When the operator executes docker run --privileged, Docker will enable access to all devices on the host as well as set some configuration in AppArmor or SELinux to allow the container nearly all the same access to the host as processes running outside containers on the host. Additional information about running with --privileged is available on the Docker Blog.
Additional resources:
ROOT IN CONTAINER, ROOT ON HOST
Hope this helps.

kubernetes: pods cannot connect to internet

I cannot connect to internet from pods. My kubernetes cluster is behind proxy.
I have already set /etc/environment and /etc/systemd/system/docker.service.d/http_proxy.conf, and confirmed that the environment variables (http_proxy, https_proxy, HTTP_PROXY, HTTPS_PROXY, no_proxy, NO_PROXY) are correct.
But in the pod, when I tried echo $http_proxy, answer is empty. I also tried curl -I https://rubygems.org but it returned curl: (6) Could not resolve host: rubygems.org.
So I think the pod doesn't receive the environment variables correctly, or there is something I forgot to do. What should I do to solve this?
I tried export http_proxy=http://xx.xx.xxx.xxx:xxxx; export https_proxy=....
After that, I tried curl -I https://rubygems.org again and received the headers with a 200 response.
What I see is that you have the wrong proxy.conf name.
As per the official documentation, the name should be /etc/systemd/system/docker.service.d/http-proxy.conf and not /etc/systemd/system/docker.service.d/http_proxy.conf.
Next you add the proxies, reload the daemon and restart docker, as mentioned in another answer provided in the comments:
/etc/systemd/system/docker.service.d/http-proxy.conf:
Content:
[Service]
Environment="HTTP_PROXY=http://x.x.x:xxxx"
Environment="HTTPS_PROXY=http://x.x.x.x:xxxx"
Then reload systemd and restart Docker:
# systemctl daemon-reload
# systemctl restart docker
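To confirm the Docker daemon actually picked up the variables (a quick check; the exact output format may vary between Docker versions):
$ sudo systemctl show --property=Environment docker
$ docker info | grep -i proxy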
Or, as per #mk_ska's answer, you can add an http_proxy setting to your Docker daemon in order to forward packets from the nested Pod container through the target proxy server.
For Ubuntu based operating system:
Add an export http_proxy='http://<proxy-host>:<port>' record to the file
/etc/default/docker
For Centos based operating system:
Add an export http_proxy='http://<proxy-host>:<port>' record to the file
/etc/sysconfig/docker
Afterwards restart Docker service.
The above will set the proxy for all containers run by the Docker engine.
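For example, on Ubuntu (the proxy address below is only a placeholder; adjust it to your environment):
$ echo "export http_proxy='http://proxy.example.com:3128'" | sudo tee -a /etc/default/docker
$ sudo systemctl restart docker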

L3 miss and Route not Found for flannel

So I have a Kubernetes cluster, and I am using Flannel for an overlay network. It has been working fine (for almost a year actually) then I modified a service to have 2 ports and all of a sudden I get this about a completely different service, one that was working previously and I did not edit:
<Timestamp> <host> flanneld[873]: I0407 18:36:51.705743 00873 vxlan.go:345] L3 miss: <Service's IP>
<Timestamp> <host> flanneld[873]: I0407 18:36:51.705865 00873 vxlan.go:349] Route for <Service's IP> not found
Is there a common cause to this? I am using Kubernetes 1.0.X and Flannel 0.5.5 and I should mention only one node is having this issue, the rest of the nodes are fine. The bad node's kube-proxy is also saying it can't find the service's endpoint.
Sometimes flannel will change its subnet configuration... you can tell this if the IP and MTU from cat /run/flannel/subnet.env don't match ps aux | grep docker (or cat /etc/default/docker)... in which case you will need to reconfigure docker to use the new flannel config.
First you have to delete the Docker bridge interface:
sudo ip link set dev docker0 down
sudo brctl delbr docker0
Next you have to reconfigure docker to use the new flannel config.
Note: sometimes this step has to be done manually (i.e. read the contents of /run/flannel/subnet.env and then alter /etc/default/docker)
source /run/flannel/subnet.env
echo "DOCKER_OPTS=\"-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}\"" | sudo tee /etc/default/docker
Finally, restart docker
sudo service docker restart
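To confirm Docker picked up the new flannel settings, a quick sanity check (using the same paths as above):
$ cat /run/flannel/subnet.env        # note FLANNEL_SUBNET and FLANNEL_MTU
$ ps aux | grep [d]ocker             # the daemon should now run with matching --bip and --mtu
$ ip addr show docker0               # the bridge address should fall inside FLANNEL_SUBNET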