KubeDNS namespace lookups failing - kubernetes

Stack
Environment: Azure
Type of install: Custom
Base OS: CentOS 7.3
Docker: 1.12.5
The first thing I will say is that I have this same install working in AWS with the same configuration files for apiserver, manager, scheduler, kubelet, and kube-proxy.
Here is the kubelet config:
/usr/bin/kubelet \
--require-kubeconfig \
--allow-privileged=true \
--cluster-dns=10.32.0.10 \
--container-runtime=docker \
--docker=unix:///var/run/docker.sock \
--network-plugin=kubenet \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--serialize-image-pulls=true \
--cgroup-root=/ \
--system-container=/system \
--node-status-update-frequency=4s \
--tls-cert-file=/var/lib/kubernetes/kubernetes.pem \
--tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \
--v=2
Kube-proxy config:
/usr/bin/kube-proxy \
--master=https://10.240.0.6:6443 \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--proxy-mode=iptables \
--v=2
Behavior:
Log in to any of the pods on any node:
nslookup kubernetes 10.32.0.10
Server: 10.32.0.10
Address 1: 10.32.0.10 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes': Try again
What does work is:
nslookup kubernetes.default.svc.cluster.local. 10.32.0.10
Server: 10.32.0.10
Address 1: 10.32.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default.svc.cluster.local.
Address 1: 10.32.0.1 kubernetes.default.svc.cluster.local
So I figured out that on Azure, the resolv.conf looked like this:
; generated by /usr/sbin/dhclient-script
search ssnci0siiuyebf1tqq5j1a1cyd.bx.internal.cloudapp.net
nameserver 10.32.0.10
options ndots:5
If I added the search domains default.svc.cluster.local svc.cluster.local cluster.local, everything started working, and I understand why.
However, this is problematic because for every namespace I create, I would need to manage the resolv.conf.
This does not happen when I deploy in Amazon, so I am kind of stumped as to why it is happening in Azure.
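For reference, the resolv.conf that made lookups work looked roughly like this (the ordering of the added search domains relative to the Azure-generated one is illustrative):
; generated by /usr/sbin/dhclient-script
search default.svc.cluster.local svc.cluster.local cluster.local ssnci0siiuyebf1tqq5j1a1cyd.bx.internal.cloudapp.net
nameserver 10.32.0.10
options ndots:5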

Kubelet has a command-line flag, --cluster-domain, which it looks like you're missing. See the docs.
Add --cluster-domain=cluster.local to your kubelet startup command, and it should start working as expected.
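For example, with the flag added, the kubelet invocation from the question would look like this (only the --cluster-domain line is new):
/usr/bin/kubelet \
--require-kubeconfig \
--allow-privileged=true \
--cluster-dns=10.32.0.10 \
--cluster-domain=cluster.local \
--container-runtime=docker \
--docker=unix:///var/run/docker.sock \
--network-plugin=kubenet \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--serialize-image-pulls=true \
--cgroup-root=/ \
--system-container=/system \
--node-status-update-frequency=4s \
--tls-cert-file=/var/lib/kubernetes/kubernetes.pem \
--tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \
--v=2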

Related

Getting patched (no change) on Kubernetes (AWS EKS) node patching

My goal is to override the default Kubelet configuration in the running cluster
"imageGCHighThresholdPercent": 85,
"imageGCLowThresholdPercent": 80,
to
"imageGCHighThresholdPercent": 60,
"imageGCLowThresholdPercent": 40,
A possible option is to apply a node patch to each node.
I'm using the following command to get the kubelet config via kubectl proxy:
curl -sSL "http://localhost:8001/api/v1/nodes/ip-172-31-20-135.eu-west-1.compute.internal/proxy/configz" | python3 -m json.tool
The output is
{
    "kubeletconfig": {
        ....
        "imageGCHighThresholdPercent": 85,
        "imageGCLowThresholdPercent": 80,
        .....
    }
}
Here is the command I'm using to update these two values:
kubectl patch node ip-172-31-20-135.eu-west-1.compute.internal -p '{"kubeletconfig":{"imageGCHighThresholdPercent":60,"imageGCLowThresholdPercent":40}}'
Unfortunately kubectl returns
node/ip-172-31-20-135.eu-west-1.compute.internal patched (no change)
As a result, the change has no effect.
Any thoughts on what I'm doing wrong?
Thanks
Patching the node object is not working because those configurations are not part of the node object.
The way to achieve this is to update the kubelet config file on the nodes and restart the kubelet process. systemctl status kubelet should tell you whether kubelet was started with a config file and where that file is located.
root@kind-control-plane:/var/lib/kubelet# systemctl status kubelet
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/kind/systemd/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2020-04-14 08:43:14 UTC; 2 days ago
Docs: http://kubernetes.io/docs/
Main PID: 639 (kubelet)
Tasks: 20 (limit: 2346)
Memory: 59.6M
CGroup: /docker/f01f57e1ef7aa7a1a8197e0e79be15415c580da33a7d048512e22418a88e0317/system.slice/kubelet.service
└─639 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false --node-ip=172.17.0.2 --fail-swap-on=false
As can be seen above, in a cluster set up by kubeadm, kubelet was started with a config file located at /var/lib/kubelet/config.yaml.
Edit the config file to set
imageGCHighThresholdPercent: 60
imageGCLowThresholdPercent: 40
Restart kubelet using systemctl restart kubelet.service
If the cluster was not started with a kubelet config file, create a new config file and pass it to kubelet at startup.
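A minimal sketch of that on a kubeadm-style node, assuming the two keys are not already present in /var/lib/kubelet/config.yaml (if they are, edit them in place instead of appending):
# append the two GC thresholds to the kubelet config file
sudo tee -a /var/lib/kubelet/config.yaml >/dev/null <<'EOF'
imageGCHighThresholdPercent: 60
imageGCLowThresholdPercent: 40
EOF
# restart kubelet so it picks up the new values
sudo systemctl restart kubelet.service
# verify via the configz endpoint used in the question (with kubectl proxy running)
curl -sSL "http://localhost:8001/api/v1/nodes/ip-172-31-20-135.eu-west-1.compute.internal/proxy/configz" | python3 -m json.tool | grep -i imagegc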
Since you are using EKS, you have to configure this in the worker node's Amazon Machine Image (AMI). An AMI provides the information required to launch an instance; you must specify an AMI when you launch an instance, and you can launch multiple instances with the same configuration from a single AMI, or use different AMIs when you need instances with different configurations.
First create the folder /var/lib/kubelet and put a kubeconfig template file into it, with content as below:
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: CERTIFICATE_AUTHORITY_FILE
    server: MASTER_ENDPOINT
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet
  name: kubelet
current-context: kubelet
users:
- name: kubelet
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: /usr/bin/heptio-authenticator-aws
      args:
        - "token"
        - "-i"
        - "CLUSTER_NAME"
Then create a template file /etc/systemd/system/kubelet.service, with content as below:
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service
Requires=docker.service
[Service]
ExecStart=/usr/bin/kubelet \
--address=0.0.0.0 \
--authentication-token-webhook \
--authorization-mode=Webhook \
--allow-privileged=true \
--cloud-provider=aws \
--cluster-dns=DNS_CLUSTER_IP \
--cluster-domain=cluster.local \
--cni-bin-dir=/opt/cni/bin \
--cni-conf-dir=/etc/cni/net.d \
--container-runtime=docker \
--max-pods=MAX_PODS \
--node-ip=INTERNAL_IP \
--network-plugin=cni \
--pod-infra-container-image=602401143452.dkr.ecr.REGION.amazonaws.com/eks/pause-amd64:3.1 \
--cgroup-driver=cgroupfs \
--register-node=true \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--feature-gates=RotateKubeletServerCertificate=true \
--anonymous-auth=false \
--client-ca-file=CLIENT_CA_FILE \
--image-gc-high-threshold=60 \
--image-gc-low-threshold=40
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
You have to add the flags --image-gc-high-threshold and --image-gc-low-threshold and specify the proper values.
--image-gc-high-threshold int32 The percent of disk usage after which image garbage collection is always run. (default 85)
--image-gc-low-threshold int32 The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. (default 80)
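If you are editing the unit file in place on an existing worker node rather than baking a new AMI, the usual follow-up is something like this (a sketch, not EKS-specific):
# reload systemd so it picks up the changed unit, then restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet.service
# confirm the flags made it onto the running process
ps aux | grep '[k]ubelet' | tr ' ' '\n' | grep image-gc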
Please take a look: eks-worker-node-ami.

Google cloud kubernetes unable to connect to cluster

I'm getting Unable to connect to the server: dial tcp <IP> i/o timeout when trying to run kubectl get pods against my cluster from Google Cloud Shell. This started out of the blue, without my making any changes to the cluster setup.
gcloud beta container clusters create tia-test-cluster \
--create-subnetwork name=my-cluster\
--enable-ip-alias \
--enable-private-nodes \
--master-ipv4-cidr <IP> \
--enable-master-authorized-networks \
--master-authorized-networks <IP> \
--no-enable-basic-auth \
--no-issue-client-certificate \
--cluster-version=1.11.2-gke.18 \
--region=europe-north1 \
--metadata disable-legacy-endpoints=true \
--enable-stackdriver-kubernetes \
--enable-autoupgrade
This is the current cluster-config.
I've run gcloud container clusters get-credentials my-cluster --zone europe-north1-a --project <my project> before doing this as well.
I also noticed that my compute instances have lost their external IPs. In our staging environment, everything works as it should based on the same config.
Any pointers would be greatly appreciated.
From what I can see of what you've posted, you've turned on master authorized networks for the network <IP>.
If the IP address of the Google Cloud Shell ever changes, that is exactly the error you would expect.
As per https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#cloud_shell: you need to update the allowed IP address.
gcloud container clusters update tia-test-cluster \
--region europe-north1 \
--enable-master-authorized-networks \
--master-authorized-networks [EXISTING_AUTH_NETS],[SHELL_IP]/32
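If you're not sure what to use for [SHELL_IP], a quick sketch (any "what is my IP" service works; ifconfig.me is just one example, and the gcloud format projection below is an assumption about the current field name):
# from Cloud Shell: find its current public IP
curl -s ifconfig.me
# list what is currently authorized on the cluster
gcloud container clusters describe tia-test-cluster --region europe-north1 \
  --format="yaml(masterAuthorizedNetworksConfig)"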

How to access kubernetes keys in etcd

Question
How do I get the Kubernetes-related keys from etcd? I tried to list the keys in etcd but could not see any related keys. Also, where is etcdctl installed?
$ etcdctl
bash: etcdctl: command not found..
$ sudo netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 386/etcd
tcp 0 0 127.0.0.1:2380 0.0.0.0:* LISTEN 386/etcd
$ curl -s http://localhost:2379/v2/keys | python -m json.tool
{
    "action": "get",
    "node": {
        "dir": true
    }
}
Background
I installed Kubernetes 1.8.5 by following Using kubeadm to Create a Cluster on CentOS 7. When I looked at Getting started with etcd, v2/keys looked to be the endpoint.
Usually you need to get etcdctl yourself. Just download the latest etcdctl archive from the etcd releases page.
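For example, a sketch for linux-amd64 (v3.4.13 is just one known release; pick the current one from the releases page):
ETCD_VER=v3.4.13
curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd.tar.gz
tar xzf /tmp/etcd.tar.gz -C /tmp
sudo cp /tmp/etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin/
etcdctl version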
Also, starting from Kubernetes version 1.6, Kubernetes uses etcd version 3, so the command to get a list of all keys is:
ETCDCTL_API=3 etcdctl --endpoints=<etcd_ip>:2379 get / --prefix --keys-only
You can find all etcdctl v3 actions using:
ETCDCTL_API=3 etcdctl --endpoints=<etcd_ip>:2379 --help
EDIT (thanks to @leodotcloud):
In case ETCD is configured with TLS certificates support:
ETCDCTL_API=3 etcdctl --endpoints <etcd_ip>:2379 --cacert <ca_cert_path> --cert <cert_path> --key <cert_key_path> get / --prefix --keys-only
Access the docker container, and run the following command:
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key get / --prefix --keys-only
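For example, to look only at the Kubernetes keys (kubeadm-based clusters store them under the /registry prefix by default), something like:
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key get /registry --prefix --keys-only | head -40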
For Minikube
(v1.17.0)
You can see the arguments by describing the pod: kubectl describe pod -n kube-system etcd-PODNAME | less
There you can see the certificate paths and much more.
To quickly query your etcd keyspace you can use this alias:
alias etcdctl_mini="MY_IP=\$(hostname -I | awk '{print \$1}' | tr -d ' '); \
sudo ETCDCTL_API=3 etcdctl --endpoints \${MY_IP}:2379 \
--cacert='/var/lib/minikube/certs/etcd/ca.crt' \
--cert='/var/lib/minikube/certs/etcd/peer.crt' \
--key='/var/lib/minikube/certs/etcd/peer.key'"
$ etcdctl_mini put foo bar
I needed to use etcdctl with etcd installed on CoreOS (Container Linux).
In my case the following worked (executed from CoreOS shell prompt):
$ sudo ETCDCTL_API=3 etcdctl --cacert /etc/ssl/etcd/etcd/peer-ca.crt --cert /etc/ssl/etcd/etcd/peer.crt --key /etc/ssl/etcd/etcd/peer.key get --prefix / --keys-only
I used sudo as a quick solution to the permission problem "Error: open /etc/ssl/etcd/etcd/peer.crt: permission denied".
You can also try the following (assuming the etcd pod name is etcd-minikube).
Minikube access using etcdctl was already explained above.
$ kubectl exec -it etcd-minikube -n kube-system -- etcdctl --cacert='/var/lib/minikube/certs/etcd/ca.crt' --cert='/var/lib/minikube/certs/etcd/peer.crt' --key='/var/lib/minikube/certs/etcd/peer.key' put foo bar
OK
$ kubectl exec -it etcd-minikube -n kube-system -- etcdctl --cacert='/var/lib/minikube/certs/etcd/ca.crt' --cert='/var/lib/minikube/certs/etcd/peer.crt' --key='/var/lib/minikube/certs/etcd/peer.key' get foo
foo
bar

Running Kubernetes in multimaster mode

I have set up a kubernetes (version 1.6.1) cluster with three servers in the control plane.
Apiserver is running with the following config:
/usr/bin/kube-apiserver \
--admission-control=NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
--advertise-address=x.x.x.x \
--allow-privileged=true \
--audit-log-path=/var/lib/k8saudit.log \
--authorization-mode=ABAC \
--authorization-policy-file=/var/lib/kubernetes/authorization-policy.jsonl \
--bind-address=0.0.0.0 \
--etcd-servers=https://kube1:2379,https://kube2:2379,https://kube3:2379 \
--etcd-cafile=/etc/etcd/ca.pem \
--event-ttl=1h \
--insecure-bind-address=0.0.0.0 \
--kubelet-certificate-authority=/var/lib/kubernetes/ca.pem \
--kubelet-client-certificate=/var/lib/kubernetes/kubernetes.pem \
--kubelet-client-key=/var/lib/kubernetes/kubernetes-key.pem \
--kubelet-https=true \
--service-account-key-file=/var/lib/kubernetes/ca-key.pem \
--service-cluster-ip-range=10.32.0.0/24 \
--service-node-port-range=30000-32767 \
--tls-cert-file=/var/lib/kubernetes/kubernetes.pem \
--tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \
--token-auth-file=/var/lib/kubernetes/token.csv \
--v=2 \
--apiserver-count=3 \
--storage-backend=etcd2
Now I am running kubelet with following config:
/usr/bin/kubelet \
--api-servers=https://kube1:6443,https://kube2:6443,https://kube3:6443 \
--allow-privileged=true \
--cluster-dns=10.32.0.10 \
--cluster-domain=cluster.local \
--container-runtime=docker \
--network-plugin=kubenet \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--serialize-image-pulls=false \
--register-node=true \
--cert-dir=/var/lib/kubelet \
--tls-cert-file=/var/lib/kubernetes/kubelet.pem \
--tls-private-key-file=/var/lib/kubernetes/kubelet-key.pem \
--hostname-override=node1 \
--v=2
This works great as long as kube1 is running. If I take kube1 down, the node does not communicate with kube2 or kube3. It always uses the first apiserver passed to the --api-servers flag and does not fail over if that apiserver crashes.
What is the correct way to do a failover when one of the apiservers fails?
The --api-servers flag is deprecated and no longer in the documentation. A kubeconfig is now the way to point kubelet at the kube-apiserver.
The kosher way to do this today is to deploy a Pod with nginx on each worker node (i.e. the ones running kubelet) that load-balances between the 3 kube-apiservers. nginx will know when one master goes down and stop routing traffic to it; that's its job. The kubespray project uses this method; a minimal sketch follows below.
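For illustration only (kubespray actually runs this nginx as a static Pod; the file path, upstream names, and local port below are assumptions): write a small TCP proxy config on each worker and point the kubelet kubeconfig's server at https://127.0.0.1:6443.
# requires nginx built with the stream module
sudo tee /etc/nginx/nginx.conf >/dev/null <<'EOF'
events {}
stream {
  upstream kube_apiserver {
    server kube1:6443;
    server kube2:6443;
    server kube3:6443;
  }
  server {
    listen 127.0.0.1:6443;
    proxy_pass kube_apiserver;
  }
}
EOF
sudo systemctl restart nginx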
The 2nd, not so good, way is to use DNS RR. Create a DNS "A" record for the IPs of the 3 masters. Point kubelet to this RR hostname instead of the 3 IPs. Each time kubelet contacts a master, it will be routed to one of the IPs in the RR list. This technique isn't robust, because traffic will still be routed to the downed node, so the cluster will experience intermittent outages.
The 3rd, and more complex, method imho is to use keepalived. keepalived uses VRRP to ensure that at least one node owns the Virtual IP (VIP). If a master goes down, another master will hijack the VIP to ensure continuity. The bad thing about this method is that load-balancing doesn't come by default: all traffic will be routed to 1 master (i.e. the primary VRRP node) until it goes down, then the secondary VRRP node will take over. You can see the nice write-up I contributed at this page :)
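For completeness, a bare-bones keepalived sketch of the VIP idea (the interface name, router id, and the VIP itself are assumptions to adjust; the other masters should use state BACKUP and a lower priority):
sudo tee /etc/keepalived/keepalived.conf >/dev/null <<'EOF'
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        10.240.0.100/24
    }
}
EOF
sudo systemctl restart keepalived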
More details about kube-apiserver HA here. Good luck!
For the moment, until 1.8, the best solution seems to be using a load-balancer, as already suggested.
See https://github.com/sipb/homeworld/issues/10.

Expose kubernetes master service to host

I'm trying my hand at kubernetes and have come across a very basic question. I have set up a single-node kubernetes on ubuntu running in VirtualBox.
This is exactly what I have. My Vagrantfile is something like this (so on my mac I can have virtualbox running ubuntu):
Vagrant.configure("2") do |config|
  config.vm.synced_folder ".", "/vagrant"
  config.vm.define "app" do |d|
    d.vm.box = "ubuntu/trusty64"
    d.vm.hostname = "kubernetes"
    # Create a private network, which allows host-only access to the machine
    # using a specific IP.
    d.vm.network "private_network", ip: "192.168.20.10"
    d.vm.provision "docker"
  end
end
And to start the master I have an init.sh something like this:
docker run --net=host -d gcr.io/google_containers/etcd:2.0.9 /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
docker run --net=host -d -v /var/run/docker.sock:/var/run/docker.sock \
gcr.io/google_containers/hyperkube:v0.18.2 /hyperkube kubelet \
--api_servers=http://localhost:8080 \
--v=2 \
--address=0.0.0.0 \
--enable_server \
--hostname_override=127.0.0.1 \
--config=/etc/kubernetes/manifests
docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v0.18.2 /hyperkube proxy --master=http://127.0.0.1:8080 --v=2
wget http://storage.googleapis.com/kubernetes-release/release/v0.19.0/bin/linux/amd64/kubectl
sudo chmod +x ./kubectl
This brings up a simple kubernetes running in the VM. Now I can see the kubernetes services running if I list services using kubectl:
kubernetes component=apiserver,provider=kubernetes <none> 10.0.0.2 443/TCP
kubernetes-ro component=apiserver,provider=kubernetes <none> 10.0.0.1 80/TCP
I can curl 10.0.0.1 from inside the VM over ssh and see the result. But my question is: how can I expose this kubernetes master service to the host machine, or, when I deploy this on a server, how can I make the master service available on a public IP?
To expose Kubernetes to the host machine, make sure you are exposing the container ports to ubuntu using the -p option in docker run. Then you should be able to access kubernetes as if it were running on the ubuntu box. If you want it to be as if it were running on the host, then port-forward the ubuntu ports to your host system. For deployment to servers there are many ways to do this; GCE has its own container engine backed by kubernetes in alpha/beta right now. Otherwise, if you want to deploy with the exact same system, most likely you'll just need the right vagrant provider and ubuntu box, and everything should be the same as your local setup.
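For the VirtualBox case specifically, since the hyperkube containers already run with --net=host, the API server's insecure port 8080 is bound on the VM itself; a sketch of reaching it from the mac, assuming the apiserver listens on 0.0.0.0 rather than only on localhost:
# from the mac host, using the private_network IP from the Vagrantfile above
curl http://192.168.20.10:8080/api
# or point kubectl at it directly
kubectl -s http://192.168.20.10:8080 get pods
# alternatively, add a forwarded port to the Vagrantfile: d.vm.network "forwarded_port", guest: 8080, host: 8080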