Redis Sentinel doesn't auto discover new slave instances - kubernetes

I've deployed the redis helm chart on k8s with Sentinel enabled.
I've set up the Master-Replicas with Sentinel topology, which means one master and two slaves. Each pod is running both the redis and sentinel containers successfully:
NAME             READY   STATUS    RESTARTS   AGE     IP             NODE
my-redis-pod-0   2/2     Running   0          5d22h   10.244.0.173   node-pool-u
my-redis-pod-1   2/2     Running   0          5d22h   10.244.1.96    node-pool-j
my-redis-pod-2   2/2     Running   0          3d23h   10.244.1.145   node-pool-e
Now, I have a Python script that connects to Redis and discovers the master by passing it the pods' IPs:
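from redis import StrictRedis
from redis.sentinel import Sentinel

# Sentinel endpoints: one (pod IP, sentinel port) pair per pod
sentinel = Sentinel([('10.244.0.173', 26379),
                     ('10.244.1.96', 26379),
                     ('10.244.1.145', 26379)],
                    sentinel_kwargs={'password': 'redispswd'})
# Ask the sentinels for the current master's address
host, port = sentinel.discover_master('mymaster')
redis_client = StrictRedis(
    host=host,
    port=port,
    password='redispswd')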
Let's suppose the master node is on my-redis-pod-0. When I run kubectl delete pod to simulate a failure that makes me lose the pod, Sentinel promotes one of the other slaves to master and Kubernetes gives me a new pod with redis and sentinel:
NAME             READY   STATUS    RESTARTS   AGE     IP             NODE
my-redis-pod-0   2/2     Running   0          3m      10.244.0.27    node-pool-u
my-redis-pod-1   2/2     Running   0          5d22h   10.244.1.96    node-pool-j
my-redis-pod-2   2/2     Running   0          3d23h   10.244.1.145   node-pool-e
The question is: how can I tell Sentinel to add this new IP to the list automatically (without code changes)?
Thanks!

Instead of using IPs, you can use the DNS entries of a headless service.
A headless service is created by explicitly specifying
clusterIP: None
You will then be able to use DNS entries like the ones below, where redis-0 will be the master (at least until a failover promotes another pod)
#syntax
pod_name.service_name.namespace.svc.cluster.local
#Example
redis-0.redis.redis.svc.cluster.local
redis-1.redis.redis.svc.cluster.local
redis-2.redis.redis.svc.cluster.local
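As a rough sketch of how the script from the question could use these names instead of pod IPs: the helper below builds the Sentinel endpoint list from the DNS pattern above. The pod prefix, service, and namespace ("redis" for all three) are taken from the example names and are assumptions about your release; the password and master name come from the question. The redis-py import is kept inside connect_master so the endpoint helper runs on its own.

```python
def sentinel_endpoints(pod_prefix, service, namespace, replicas, port=26379):
    """Return one (hostname, port) pair per StatefulSet pod."""
    return [
        (f"{pod_prefix}-{i}.{service}.{namespace}.svc.cluster.local", port)
        for i in range(replicas)
    ]


def connect_master(endpoints, password, master_name="mymaster"):
    """Return a client for the current master (requires the redis package)."""
    # Imported lazily so sentinel_endpoints() works without redis-py installed.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(endpoints, sentinel_kwargs={"password": password})
    # master_for() looks the master up through Sentinel when it connects,
    # so the client should follow a failover without code changes.
    return sentinel.master_for(master_name, password=password)


# Assumed names: pods redis-0..2, headless service "redis", namespace "redis".
endpoints = sentinel_endpoints("redis", "redis", "redis", replicas=3)
for host, port in endpoints:
    print(host, port)
```

Because the hostnames stay the same when a pod is recreated, the list does not need to change when my-redis-pod-0 comes back with a new IP.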
References:
What is a headless service, what does it do/accomplish, and what are some legitimate use cases for it?
https://www.containiq.com/post/deploy-redis-cluster-on-kubernetes

Related

How do pods get their IP addresses?

I would like to know how exactly pods get an IP address, and how pods are distributed between the agent and master nodes.
I have 1 master node and 2 agent nodes. My pods are all running well, but I am curious how the pods get their IP addresses.
Some pods have a cluster IP, while others have an Ethernet (node) IP address. I run Nginx and MetalLB for load balancing, with Traefik and Klipper disabled.
As shown below, agent-03 has pods running with two different kinds of IP address:
root:/# kubectl get pods -A -o wide
ingress nginx-dep-fdcd8sdfs-gj5gff 1/1 Running 0 46h 10.42.0.80 master <none> <none>
ingress nginx-dep-fdcd8sdfs-dn80n 1/1 Running 0 46h 10.42.0.79 master <none> <none>
ingress nginx-doc-7cc85c5899-sdh55 1/1 Running 0 44h 10.42.0.82 master <none> <none>
ingress nginx-doc-7cc85c5899-gjghs 1/1 Running 0 44h 10.42.0.83 master <none> <none>
prometheus prometheus-node-exporter-6tl8t 1/1 Running 0 47h 192.168.1.3 agent-03 <none> <none>
ingress ingress-controller-nginx-ingress-controller-rqs8n 1/1 Running 5 47h 192.168.1.3 agent-03 <none> <none>
prometheus prometheus-kube-prometheus-operator-68fbcb6d67-8qsnf 1/1 Running 1 46h 10.42.2.52 agent-03 <none> <none>
ingress nginx-doc-7cc85c5899-b77j6 1/1 Running 0 43h 10.42.2.57 agent-03 <none> <none>
metallb-system speaker-sk4pz 1/1 Running 1 47h 192.168.1.3 agent-03 <none> <none>
In my pod list, on agent-03 nginx-doc uses a cluster IP while MetalLB uses the Ethernet IP. Or does it depend on which service is running in the pod?
ingress nginx-doc-7cc85c5899-b77j6 1/1 Running 0 43h 10.42.2.57 agent-03 <none> <none>
metallb-system speaker-sk4pz 1/1 Running 1 47h 192.168.1.3 agent-03 <none> <none>
I can also see that the master has 2 nginx-doc pods running, which means that when I deploy 3 nginx-doc replicas, one agent will not get any nginx-doc pod because the master has taken it; the pods are not divided equally.
If I have misconfigured something, which part do I need to fix?
Your pods get their IPs from the network plugin you are using; these are mostly internal IPs.
There are different network plugins (CNI) that you can choose from as needed: https://kubernetes.io/docs/concepts/cluster-administration/networking/
Pods are exposed by a Service. There are different types of Service: ClusterIP, NodePort, LoadBalancer. https://kubernetes.io/docs/concepts/services-networking/service/
In my pod list, on agent-03 nginx-doc uses a cluster IP while MetalLB
uses the Ethernet IP. Or does it depend on which service is running in the pod?
This could be due to the service type you are using, which would explain why the IP is different and on the Ethernet network.
If your service type is LoadBalancer with MetalLB, the service is exposed on that IP, unlike the internal IPs that pods mostly have.
Run kubectl get svc -n <namespace name> and check.
I can also see that the master has 2 nginx-doc pods running, which means that when I
deploy 3 nginx-doc replicas, one agent will not get any nginx-doc pod because the
master has taken it; the pods are not divided equally.
There is no guarantee of that; K8s assigns pods to nodes based on a score.
You can read more about scoring here: https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/
If you want to pin your pod to a specific node (for example, a node with a GPU that the pod needs), you can use:
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
A pod's IP address is provided by the CNI driver from the range that was specified when the cluster was created using --pod-network-cidr, see here.
Some CNI implementations can implement additional behavior.
In your particular case, I believe the pods in question are started with hostNetwork: true in their PodSpec, which gives them access to the host network, so they report the node's IP.
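A quick way to tell the two kinds of address apart is to test each pod IP against the pod CIDR. This is only a sketch: 10.42.0.0/16 is an assumption inferred from the 10.42.x.x addresses in the listing, so substitute the range your cluster was created with. The pod names and IPs are copied from the question.

```python
import ipaddress

# Assumption: the pod CIDR, inferred from the 10.42.x.x pod IPs in the question.
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")


def ip_kind(pod_ip, pod_cidr=POD_CIDR):
    """Classify a pod IP as CNI-assigned or outside the pod network."""
    if ipaddress.ip_address(pod_ip) in pod_cidr:
        return "pod network (assigned by CNI)"
    return "outside pod CIDR (probably hostNetwork: true)"


# Pods on agent-03, from the kubectl output in the question.
pods = {
    "nginx-doc-7cc85c5899-b77j6": "10.42.2.57",
    "speaker-sk4pz": "192.168.1.3",
    "prometheus-node-exporter-6tl8t": "192.168.1.3",
}

for name, ip in pods.items():
    print(f"{name}: {ip} -> {ip_kind(ip)}")
```

To confirm for a given pod, check whether hostNetwork is set in its spec.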

Autoscaler not scaling up leaving nodes in NotReady state and pods in Unknown state

I am running a cluster on GKE with a single node pool. It has 3 nodes and can scale from 1 to 99 nodes. The cluster uses the nginx-ingress controller.
On this cluster, I want to deploy apps. An app is scoped by a namespace and consists of 3 deployments and one ingress (defining paths to access the application from the internet). Each deployment runs a single replica of a container.
Deploying a couple of apps works fine, but deploying many apps (requiring the node pool to scale up) breaks everything:
All pods start showing warnings (including those successfully deployed earlier):
kubectl get pods --namespace bcd
NAME READY STATUS RESTARTS AGE
actions-664b7d79f5-7qdkw 1/1 Unknown 1 35m
actions-664b7d79f5-v8s2m 1/1 Running 1 18m
core-85cb74f89b-ns49z 1/1 Unknown 1 35m
core-85cb74f89b-qqzfp 1/1 Running 1 18m
nlu-77899ddbf-8pd7k 1/1 Running 1 27m
All nodes become unready:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-clients-projects-default-pool-f9af73d4-gzwr NotReady <none> 42m v1.9.7-gke.6
gke-clients-projects-default-pool-f9af73d4-p5l2 NotReady <none> 21m v1.9.7-gke.6
gke-clients-projects-default-pool-f9af73d4-wnxc NotReady <none> 37m v1.9.7-gke.6
Deleting the namespace to remove all resources from the cluster also seems to fail: after a long while the pods are still there, in an Unknown state.
How can I safely add more apps and let the cluster autoscale?
The reason seems to be that, without knowing the resources each pod needs, the scheduler places pods on any available node, potentially exhausting that node's resources and putting the Docker daemon in an inconsistent state.
The solution is to specify resource requests and limits: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container
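For illustration, here is roughly what that looks like on a container spec, written as a Python dict that can be dumped to JSON (kubectl accepts JSON manifests as well as YAML). The image name and the CPU/memory sizes are hypothetical placeholders; pick values from what your containers actually use.

```python
import json


def with_resources(container, cpu_request, mem_request, cpu_limit, mem_limit):
    """Return a copy of a container spec with requests and limits set."""
    spec = dict(container)
    spec["resources"] = {
        # What the scheduler reserves on a node when placing the pod.
        "requests": {"cpu": cpu_request, "memory": mem_request},
        # Hard caps enforced at runtime.
        "limits": {"cpu": cpu_limit, "memory": mem_limit},
    }
    return spec


# Hypothetical image and sizes; "core" is one of the deployments in the question.
container = with_resources(
    {"name": "core", "image": "example/core:1.0"},
    cpu_request="250m", mem_request="256Mi",
    cpu_limit="500m", mem_limit="512Mi",
)
print(json.dumps(container, indent=2))
```

With requests set, pods that do not fit on the existing nodes stay Pending instead of being packed onto them, which is the signal the cluster autoscaler uses to add nodes.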

Is it possible to access pod information from inside the container?

I have a deployment configured with 5 replicas. Inside each running container, I want to know the name of the pod replica it belongs to.
When I execute:
kubectl get pods
NAME READY STATUS RESTARTS AGE
test-581957695-cbjtm 1/1 Running 3 1d
test-581957695-dnv8s 1/1 Running 1 1d
test-581957695-fv467 1/1 Running 1 1d
test-581957695-m74lc 1/1 Running 0 1d
test-581957695-s6cx0 1/1 Running 1 1d
Is it possible to get the name "test-581957695-cbjtm" from inside the container?
Thank you.
You can use environment variables to expose pod information to the container.
https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
Another option is to use the Kubernetes Downward API. Using the API, you can expose pod/container information inside your container via a volume file or environment variables.
Currently, these are the pieces of information you can expose:
The node’s name
The Pod’s name
The Pod’s namespace
The Pod’s IP address
The Pod’s service account name
A Container’s CPU limit
A Container’s CPU request
A Container’s memory limit
A Container’s memory request
In addition, the following information is available through DownwardAPIVolumeFiles:
The Pod’s labels
The Pod’s annotations
For more information please see, https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#the-downward-api
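On the application side, reading the pod name could look like the sketch below. POD_NAME is a hypothetical variable name: it only exists if the pod spec defines it with a fieldRef to metadata.name, as described in the first link. The fallback relies on the pod's hostname being the pod name, which is the default unless the spec overrides the hostname or uses the host network.

```python
import os
import socket


def pod_name(env=os.environ):
    """Return the pod name from the environment, falling back to the hostname."""
    # POD_NAME is an assumed name; it must be injected via the Downward API
    # (env valueFrom fieldRef with fieldPath metadata.name) in the pod spec.
    name = env.get("POD_NAME")
    if name:
        return name
    # By default a pod's hostname equals its pod name.
    return socket.gethostname()


# Simulate the injected variable with a pod name from the question.
print(pod_name({"POD_NAME": "test-581957695-cbjtm"}))
```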

kubernetes service IPs not reachable

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.
$ kubectl get no
NAME STATUS AGE
coreos-master-1 Ready,SchedulingDisabled 1h
coreos-worker-1 Ready 54m
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
$ kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default curl-2421989462-h0dr7 1/1 Running 1 53m 10.2.26.4 coreos-worker-1
kube-system busybox 1/1 Running 0 55m 10.2.26.3 coreos-worker-1
kube-system kube-apiserver-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-controller-manager-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
kube-system kube-proxy-coreos-worker-1 1/1 Running 0 58m 192.168.0.204 coreos-worker-1
kube-system kube-scheduler-coreos-master-1 1/1 Running 0 1h 192.168.0.200 coreos-master-1
$ kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.3.0.1 <none> 443/TCP 1h
As in the guide, I've set up a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. The pod network seems fine, as the busybox and curl containers get IPs. But the service network has problems. Originally, I encountered this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all containers and DNS was ultimately not working.
From within the curl pod, I can reproduce the issue:
[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host
[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0 src 10.2.26.4
It seems OK that there's only a default route in the container. As I understood it, the request (to the default route) should be intercepted by the kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.
There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:
core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
I'm having a really hard time troubleshooting, as I lack an understanding of what the service IP is used for, how the service network is supposed to work in terms of traffic flow, and how best to debug this.
So here are the questions I have:
What is the 1st IP of the service network (10.3.0.1 in this case) used for?
Is the above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
What are the best ways to debug each step in the traffic flow? (I can't get any idea what's wrong from the logs)
Thanks!
The Service network provides fixed IPs for Services. It is not a routable network (so don't expect ip ro to show anything, nor will ping work) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.
The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
Troubleshooting is different on each network/cluster setup. I would generally check:
Is kube-proxy running on each node? On some setups it's run via systemd and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml
Locate logs for kube-proxy and find clues (can you post some?)
Change kube-proxy into userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
Is the overlay (pod) network functional?
If you leave comments I will get back to you.
I had this same problem, and the ultimate solution that worked for me was enabling IP forwarding on all nodes in the cluster, which I had neglected to do.
$ sudo sysctl net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Service IPs and DNS started working immediately afterwards.
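Since both sysctls mentioned in this thread (net.ipv4.ip_forward and net.bridge.bridge-nf-call-iptables) have to be 1 on every node, a small check script can help rule them out. This is a sketch that reads the values from /proc/sys, which is where sysctl gets them on Linux; the function name and the proc_sys parameter are my own.

```python
from pathlib import Path

# Settings this thread found relevant for Service IPs; both should be "1".
REQUIRED = {
    "net/ipv4/ip_forward": "1",
    "net/bridge/bridge-nf-call-iptables": "1",
}


def check_sysctls(proc_sys="/proc/sys", required=REQUIRED):
    """Return {sysctl name: actual value or None} for each setting that differs."""
    problems = {}
    for rel_path, expected in required.items():
        try:
            actual = (Path(proc_sys) / rel_path).read_text().strip()
        except OSError:
            # File missing or unreadable, e.g. the bridge netfilter module is not loaded.
            actual = None
        if actual != expected:
            problems[rel_path.replace("/", ".")] = actual
    return problems


if __name__ == "__main__":
    for name, value in check_sysctls().items():
        print(f"{name} is {value!r}, expected '1'")
```

Run it on each node (not inside a pod); no output means both settings are as expected.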
I had the same issue; it turned out to be a configuration issue in kube-proxy.yaml. For the "master" parameter I had the IP address, as in - --master=192.168.3.240, but it actually needs to be a URL, like - --master=https://192.168.3.240
FYI, my kube-proxy successfully uses --proxy-mode=iptables (v1.6.x)
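If you generate the manifest from a script, a simple sanity check on the --master value can catch the bare-IP mistake early. A minimal sketch using only the standard library; the function name is hypothetical and the two values are the ones from this answer.

```python
from urllib.parse import urlsplit


def is_master_url(value):
    """True if value looks like an http(s) URL rather than a bare IP or hostname."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for candidate in ("192.168.3.240", "https://192.168.3.240"):
    status = "ok" if is_master_url(candidate) else "missing scheme"
    print(f"--master={candidate}: {status}")
```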

Kubernetes - not unique ip per pod

I'm building a 3 VM (CentOS 7) cluster of Kubernetes 1.3.2.
According to this Kubernetes documentation page, Networking in Kubernetes: “We give every pod its own IP address”, so there is no port collision when several pods use the same ports on the same node.
But as seen here, the pods do get the same IP addresses:
[root@gloom kuber-test]# kubectl get pods -o wide -l app=userloc
NAME READY STATUS RESTARTS AGE IP NODE
userloc-dep-857294609-0am9d 1/1 Running 0 27m 172.17.0.5 157.244.150.86
userloc-dep-857294609-a4538 1/1 Running 0 27m 172.17.0.7 157.244.150.96
userloc-dep-857294609-c4wzy 1/1 Running 0 6h 172.17.0.3 157.244.150.86
userloc-dep-857294609-hbl9i 1/1 Running 0 6h 172.17.0.5 157.244.150.96
userloc-dep-857294609-rpgyd 1/1 Running 0 27m 172.17.0.5 157.244.150.198
userloc-dep-857294609-tnnho 1/1 Running 0 6h 172.17.0.3 157.244.150.198
What am I missing?
EDIT - 31/07/16:
Following Sven Walter's comments, maybe the issue is that somehow the IPs which the pods had received are of the docker bridge subnet 172.17.0.0/16 (which is not distinct per node) instead of flannel’s subnets 10.x.x.x/24 (which are distinct per node).
Can this be the issue?
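To make the symptom concrete, this sketch groups the pod IPs from the listing above by node and reports every IP that shows up on more than one node. It also checks the hypothesis from the edit, namely that all of them fall inside the docker bridge subnet 172.17.0.0/16. The data is copied from the kubectl output; the helper name is my own.

```python
import ipaddress
from collections import defaultdict

DOCKER_BRIDGE = ipaddress.ip_network("172.17.0.0/16")

# (pod IP, node) pairs copied from the kubectl output above.
PODS = [
    ("172.17.0.5", "157.244.150.86"),
    ("172.17.0.7", "157.244.150.96"),
    ("172.17.0.3", "157.244.150.86"),
    ("172.17.0.5", "157.244.150.96"),
    ("172.17.0.5", "157.244.150.198"),
    ("172.17.0.3", "157.244.150.198"),
]


def duplicate_pod_ips(pods):
    """Map each pod IP used on more than one node to the sorted list of those nodes."""
    nodes_by_ip = defaultdict(set)
    for ip, node in pods:
        nodes_by_ip[ip].add(node)
    return {ip: sorted(nodes) for ip, nodes in nodes_by_ip.items() if len(nodes) > 1}


all_on_bridge = all(ipaddress.ip_address(ip) in DOCKER_BRIDGE for ip, _ in PODS)
print("duplicates:", duplicate_pod_ips(PODS))
print("all IPs in docker bridge subnet:", all_on_bridge)
```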
In case needed, here is the deployment yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: userloc-dep
spec:
  replicas: 6
  template:
    metadata:
      labels:
        app: userloc
    spec:
      containers:
      - name: userloc
        image: globe:5000/openlso/userlocation-ms:0.1
        ports:
        - containerPort: 8081
The issue occurred because, following the docker documentation, I had added an additional docker config in /etc/systemd/system/docker.service.d/docker.conf that overrides the config in /usr/lib/systemd/system/docker.service. Unfortunately, the scripts I used to set up the cluster (master.sh and worker.sh) don't refer to the first file but to the second one.
Once I removed the docker.conf file, the pods got addresses from flannel’s subnet.
After configuring flannel, assuming you did so correctly, each node will grab a slice of the overall IP network CIDR. You can figure out which CIDR is assigned to which node by running etcdctl ls -r and looking for a key like "coreos.com". The subnet slices assigned to each node should be unique.
Once a node has a subnet, flannel assigns that CIDR to flannel.0 (a vxlan device) and you need to restart docker, see e.g. https://github.com/coreos/flannel#docker-integration. If you failed to restart docker, or the options are wrong, or flannel isn't running on the node, or non-unique subnets are assigned to different nodes, things won't work as expected. Reply to this if you need more help debugging and we can take it from there.
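Once you have collected the subnet assigned to each node, checking that they are unique is easy to script. A minimal sketch: the node names and CIDRs below are made-up placeholders, not values from this cluster, so fill in what you read from etcd.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(subnet_by_node):
    """Return pairs of node names whose assigned subnets overlap."""
    nets = {node: ipaddress.ip_network(cidr) for node, cidr in subnet_by_node.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical leases; node-c deliberately reuses node-a's subnet.
leases = {
    "node-a": "10.1.15.0/24",
    "node-b": "10.1.20.0/24",
    "node-c": "10.1.15.0/24",
}
print(overlapping_subnets(leases))
```

An empty list means every node has its own slice.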
Maybe this can help you: I had the same problem when I had more than one network interface. To fix it, I specified the network interface that flannel uses to communicate with the other nodes:
flanneld --iface=enp0s8
In my case, I changed that in /etc/sysconfig/flanneld:
FLANNEL_ETCD="http://master.gary.local:2379"
FLANNEL_ETCD_KEY="/atomic.io/network"
FLANNEL_OPTIONS="--iface=enp0s8"
After changing that, you obviously need to restart the docker and flanneld daemons.