Kubernetes + Calico + ReplicaSet

So I found myself in a pretty sticky situation. I'm trying to create a simple ReplicaSet, but unfortunately I ran into some problems with Calico.
I have 2 VMs running on Oracle VM. They are configured to use the enp0s8 interface. The IP of the master node is 192.168.56.2 and the worker node's IP is 192.168.56.3.
Here is what I'm doing in Kubernetes. First I'm initializing the Kubernetes master node:
kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=192.168.56.2
After successfully initializing, I'm running:
export KUBECONFIG=/etc/kubernetes/admin.conf
Now I'm creating the pod network by running:
kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calico.yaml
After that I'm joining from the worker node successfully. Whenever I start the ReplicaSet with:
*** edit: I don't even have to create the ReplicaSet to get the same result; the newly created calico-node pod gets stuck either way.
kubectl create -f replicaset-definition.yml
where the YAML looks like this:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-replicaset
  labels:
    app: myapp
    type: front-end
spec:
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
    spec:
      containers:
        - name: nginx-container
          image: nginx
  replicas: 2
  selector:
    matchLabels:
      app: myapp
I'm getting a new calico-node pod created, which eventually gets stuck:
calico-node-mcb5g 0/1 Running 6 8m58s
calico-node-t9p5n 1/1 Running 0 12m
If I run
kubectl logs -n kube-system calico-node-mcb5g -f
on it I get the following logs:
2020-03-18 14:45:40.585 [INFO][8] startup.go 275: Using NODENAME environment for node name
2020-03-18 14:45:40.585 [INFO][8] startup.go 287: Determined node name: kubenode1
2020-03-18 14:45:40.587 [INFO][8] k8s.go 228: Using Calico IPAM
2020-03-18 14:45:40.588 [INFO][8] startup.go 319: Checking datastore connection
2020-03-18 14:46:10.589 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: i/o timeout
2020-03-18 14:46:41.591 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: i/o timeout
I've tried editing calico.yaml and adding the following lines to the calico-node container's env:
- name: IP_AUTODETECTION_METHOD
  value: "interface=enp0s8"
but the result is still the same.
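(For reference, an equivalent way to set this without re-applying the whole manifest is to patch the running DaemonSet directly; a minimal sketch, assuming the default calico-node DaemonSet name in kube-system:)
# Set the IP autodetection method on the running calico-node DaemonSet
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=enp0s8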
Thank you so much for reading this and if you have any advice I will be sooo grateful!!!

OK, so here it goes. What it seems to have been is that the calico-node pod crashed because the pod network CIDR (192.168.0.0/16) overlapped with the host network CIDR (192.168.56.0/24).
If I initialize the master node with the pod CIDR changed, like this:
kubeadm init --pod-network-cidr=20.96.0.0/12 --apiserver-advertise-address=192.168.56.2
it works like a charm.
This helped a lot:
Cluster Creation Successful but calico-node-xx pod is in CrashLoopBackOff Status
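To spot this kind of overlap before re-initializing, you can compare the CIDRs kubeadm recorded against the node's own addresses; a minimal sketch, assuming a kubeadm cluster (where the kubeadm-config ConfigMap exists by default):
# Pod and service CIDRs the cluster was created with
kubectl -n kube-system get cm kubeadm-config -o yaml | grep -iE 'podSubnet|serviceSubnet'
# The node's host network, to check for an overlap
ip -4 addr show enp0s8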


How to restrict a pod to connect only to 2 pods using NetworkPolicy and test the connection in k8s in a simple way?

Do I still need to expose the pod via a ClusterIP service?
There are 3 pods: main, front and api. I need to allow ingress and egress connections to the main pod only from the api and front pods. I also created service-main, a service that exposes the main pod on port 80.
I don't know how to test it; I tried:
k exec main -it -- sh
nc -z -v -w 5 service-main 80
and
k exec main -it -- sh
curl front:80
The main.yaml pod:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: main
    item: c18
  name: main
spec:
  containers:
    - image: busybox
      name: main
      command:
        - /bin/sh
        - -c
        - sleep 1d
The front.yaml:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: front
  name: front
spec:
  containers:
    - image: busybox
      name: front
      command:
        - /bin/sh
        - -c
        - sleep 1d
The api.yaml:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: api
  name: api
spec:
  containers:
    - image: busybox
      name: api
      command:
        - /bin/sh
        - -c
        - sleep 1d
The main-to-front-networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: front-end-policy
spec:
  podSelector:
    matchLabels:
      app: main
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: front
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: front
      ports:
        - port: 8080
What am I doing wrong? Do I still need to expose the main pod via a service? Shouldn't the network policy take care of this already?
Also, do I need to set containerPort: 80 in the main pod? How can I test connectivity and ensure ingress/egress works for the main pod only to/from the api and front pods?
I tried the lab from a CKAD prep course; it had 2 pods: secure-pod and web-pod. There was an issue with connectivity, and the solution was to create a network policy and test using netcat from inside the web-pod's container:
k exec web-pod -it -- sh
nc -z -v -w 1 secure-service 80
connection open
UPDATE: ideally I want answers to these:
a clear explanation of the difference between a Service and a NetworkPolicy.
If both a Service and a NetworkPolicy exist, in what order is the traffic/request evaluated? Does it go through the NetworkPolicy first and then the Service, or vice versa?
If I want the front and api pods to send/receive traffic to main, do I need separate services exposing the front and api pods?
Network policies and services are two different and independent Kubernetes resources.
Service is:
An abstract way to expose an application running on a set of Pods as a network service.
Good explanation from the Kubernetes docs:
Kubernetes Pods are created and destroyed to match the state of your cluster. Pods are nonpermanent resources. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them "backends") provides functionality to other Pods (call them "frontends") inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
Enter Services.
Also another good explanation in this answer.
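For example, a minimal Service for the main pod from the question might look like this (a sketch; it assumes the container actually listens on port 80 and reuses the service-main name from the question):
apiVersion: v1
kind: Service
metadata:
  name: service-main
spec:
  selector:
    app: main          # matches the main pod's label
  ports:
    - port: 80         # port the Service exposes
      targetPort: 80   # port the container listens on (assumed)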
For production you should use workload resources instead of creating pods directly:
Pods are generally not created directly and are created using workload resources. See Working with Pods for more information on how Pods are used with workload resources.
Here are some examples of workload resources that manage one or more Pods:
Deployment
StatefulSet
DaemonSet
And use services to make requests to your application.
Network policies are used to control traffic flow:
If you want to control traffic flow at the IP address or port level (OSI layer 3 or 4), then you might consider using Kubernetes NetworkPolicies for particular applications in your cluster.
Network policies target pods, not services (an abstraction). Check this answer and this one.
Regarding your examples - your network policy is correct (as I tested it below). The problem is that your cluster may not be compatible:
For Network Policies to take effect, your cluster needs to run a network plugin which also enforces them. Project Calico or Cilium are plugins that do so. This is not the default when creating a cluster!
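A quick way to check whether such a plugin is running (assuming it was installed into kube-system, which is the usual place):
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave'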
Test on a kubeadm cluster with the Calico plugin: I created similar pods as you did, but I changed the container part:
spec:
  containers:
    - name: main
      image: nginx
      command: ["/bin/sh", "-c"]
      args: ["sed -i 's/listen .*/listen 8080;/g' /etc/nginx/conf.d/default.conf && exec nginx -g 'daemon off;'"]
      ports:
        - containerPort: 8080
So the NGINX app is available on port 8080.
Let's check the pods' IPs:
user#shell:~$ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP               NODE                                NOMINATED NODE   READINESS GATES
api     1/1     Running   0          48m   192.168.156.61   example-ubuntu-kubeadm-template-2   <none>           <none>
front   1/1     Running   0          48m   192.168.156.56   example-ubuntu-kubeadm-template-2   <none>           <none>
main    1/1     Running   0          48m   192.168.156.52   example-ubuntu-kubeadm-template-2   <none>           <none>
Let's exec into the running main pod and try to make a request to the api pod (192.168.156.61):
root#main:/# curl 192.168.156.61:8080
<!DOCTYPE html>
...
<title>Welcome to nginx!</title>
It is working.
After applying your network policy:
user#shell:~$ kubectl apply -f main-to-front.yaml
networkpolicy.networking.k8s.io/front-end-policy created
user#shell:~$ kubectl exec -it main -- bash
root#main:/# curl 192.168.156.61:8080
...
It is not working anymore, which means the network policy was applied successfully.
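As an extra check, traffic that the policy does allow should still flow; for example, a request from main to the front pod's IP (from the listing above) on port 8080 should still succeed:
kubectl exec -it main -- curl 192.168.156.56:8080   # front pod IP; egress from main to front on 8080 is allowed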
A nice option to get more information about an applied network policy is to run the kubectl describe command:
user#shell:~$ kubectl describe networkpolicy front-end-policy
Name:         front-end-policy
Namespace:    default
Created on:   2022-01-26 15:17:58 +0000 UTC
Labels:       <none>
Annotations:  <none>
Spec:
  PodSelector:     app=main
  Allowing ingress traffic:
    To Port: 8080/TCP
    From:
      PodSelector: app=front
  Allowing egress traffic:
    To Port: 8080/TCP
    To:
      PodSelector: app=front
  Policy Types: Ingress, Egress

Kubernetes clusterIP does not load balance requests [duplicate]

My Environment: Mac dev machine with latest Minikube/Docker
I built (locally) a simple Docker image with a simple Django REST API "hello world". I'm running a deployment with 3 replicas. This is my YAML file for defining it:
apiVersion: v1
kind: Service
metadata:
  name: myproj-app-service
  labels:
    app: myproj-be
spec:
  type: LoadBalancer
  ports:
    - port: 8000
  selector:
    app: myproj-be
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myproj-app-deployment
  labels:
    app: myproj-be
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myproj-be
  template:
    metadata:
      labels:
        app: myproj-be
    spec:
      containers:
        - name: myproj-app-server
          image: myproj-app-server:4
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              value: postgres://myname:#10.0.2.2:5432/myproj2
            - name: REDIS_URL
              value: redis://10.0.2.2:6379/1
When I apply this YAML it creates everything correctly:
- one deployment
- one service
- three pods
Deployments:
NAME READY UP-TO-DATE AVAILABLE AGE
myproj-app-deployment 3/3 3 3 79m
Services:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 83m
myproj-app-service LoadBalancer 10.96.91.44 <pending> 8000:31559/TCP 79m
Pods:
NAME READY STATUS RESTARTS AGE
myproj-app-deployment-77664b5557-97wkx 1/1 Running 0 48m
myproj-app-deployment-77664b5557-ks7kf 1/1 Running 0 49m
myproj-app-deployment-77664b5557-v9889 1/1 Running 0 49m
The interesting thing is that when I SSH into the Minikube VM and hit the service using curl 10.96.91.44:8000, it respects the LoadBalancer type of the service and rotates between all three pods as I hit the endpoint again and again. I can see that in the returned results, in which I have made sure to include the HOSTNAME of the pod.
However, when I try to access the service from my host Mac using kubectl port-forward service/myproj-app-service 8000:8000, every time I hit the endpoint the same pod responds. It doesn't load balance. I can see that clearly when I kubectl logs -f <pod> on all three pods: only one of them is handling the hits, while the other two are idle.
Is this a kubectl port-forward limitation or issue? or am I missing something greater here?
kubectl port-forward looks up the first Pod from the Service information provided on the command line and forwards directly to a Pod rather than forwarding to the ClusterIP/Service port. The cluster doesn't get a chance to load balance the service like regular service traffic.
The kubernetes API only provides Pod port forward operations (CREATE and GET). Similar API operations don't exist for Service endpoints.
kubectl code
Here's a little bit of the flow from the kubectl code that seems to back that up (I'll just add that Go isn't my primary language)
The portforward.go Complete function is where kubectl portforward does the first look up for a pod from options via AttachablePodForObjectFn:
The AttachablePodForObjectFn is defined as attachablePodForObject in this interface, then here is the attachablePodForObject function.
To my (inexperienced) Go eyes, it appears attachablePodForObject is the thing kubectl uses to look up a Pod from a Service defined on the command line.
Then from there on everything deals with filling in the Pod specific PortForwardOptions (which doesn't include a service) and is passed to the kubernetes API.
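If the goal is simply to exercise the Service's normal load balancing from the host, one workaround (a sketch, assuming a Minikube setup like the one in the question) is to go through the Service itself rather than port-forwarding to a single pod:
# Get a routable URL for the service from the host
minikube service myproj-app-service --url
# or run `minikube tunnel` in another terminal so the LoadBalancer gets an external IP
minikube tunnel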
The reason was that my pods were randomly in a crashing state due to Python *.pyc files that were left in the container. This causes issues when Django is running in a multi-pod Kubernetes deployment. Once I fixed this and all pods ran successfully, the round-robin started working.
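(If stray .pyc files are indeed the culprit, one common way to keep them out of running containers is to disable bytecode writing via an environment variable; a sketch of what could be added to the container's env in the Deployment above:)
            - name: PYTHONDONTWRITEBYTECODE   # stop Python from writing .pyc files
              value: "1"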

EKS, Windows node. networkPlugin cni failed

Per https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html, I ran the command, eksctl utils install-vpc-controllers --cluster <cluster_name> --approve
My EKS version is v1.16.3. I tried to deploy Windows Docker images to a Windows node and got the error below.
Warning FailedCreatePodSandBox 31s kubelet, ip-west-2.compute.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "ab8001f7b01f5c154867b7e" network for pod "mrestapi-67fb477548-v4njs": networkPlugin cni failed to set up pod "mrestapi-67fb477548-v4njs_ui" network: failed to parse Kubernetes args: pod does not have label vpc.amazonaws.com/PrivateIPv4Address
$ kubectl logs vpc-resource-controller-645d6696bc-s5rhk -n kube-system
I1010 03:40:29.041761 1 leaderelection.go:185] attempting to acquire leader lease kube-system/vpc-resource-controller...
I1010 03:40:46.453557 1 leaderelection.go:194] successfully acquired lease kube-system/vpc-resource-controller
W1010 23:57:53.972158 1 reflector.go:341] pkg/mod/k8s.io/client-go#v0.0.0-20180910083459-2cefa64ff137/tools/cache/reflector.go:99: watch of *v1.Pod ended with: too old resource version: 1480444 (1515040)
It complains about a too old resource version. How do I upgrade the version?
I removed the Windows nodes and re-created them with a different instance type, but it did not work.
I removed the Windows node group and re-created it. It did not work.
Finally, I removed the entire EKS cluster and re-created it. The command kubectl describe node <windows_node> gives me the output below.
vpc.amazonaws.com/CIDRBlock 0 0
vpc.amazonaws.com/ENI 0 0
vpc.amazonaws.com/PrivateIPv4Address 1 1
I deployed windows-server-iis.yaml and it works as expected. The root cause of the problem remains a mystery.
To troubleshoot this I would...
First list the components to make sure they're running:
$ kubectl get pod -n kube-system | grep vpc
vpc-admission-webhook-deployment-7f67d7b49-wgzbg 1/1 Running 0 38h
vpc-resource-controller-595bfc9d98-4mb2g 1/1 Running 0 29
If they are running, check their logs:
kubectl logs <vpc-yadayada> -n kube-system
Make sure the instance type you are using has enough available IPs per ENI, because in the Windows world only one ENI is used, and it is limited to the maximum available IPs per ENI minus one for the primary IP address. I have run into this error before when I exceeded the number of IPs available to my ENI. A quick way to check those limits is shown below.
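A sketch of looking up the per-ENI limits for a given instance type (assumes the AWS CLI is configured; replace m5.large with your instance type):
aws ec2 describe-instance-types \
  --instance-types m5.large \
  --query "InstanceTypes[0].NetworkInfo.{ENIs:MaximumNetworkInterfaces,IPsPerENI:Ipv4AddressesPerInterface}"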
Confirm that the nodeSelector of your pod is right:
nodeSelector:
  kubernetes.io/os: windows
  kubernetes.io/arch: amd64
As an anecdote, I have done the steps mentioned under the To enable Windows support for your cluster with a macOS or Linux client section of the doc you linked on a few clusters to date, and they have worked well.
What is your output for
kubectl describe node <windows_node>
?
If it's like this:
vpc.amazonaws.com/CIDRBlock: 0
vpc.amazonaws.com/ENI: 0
vpc.amazonaws.com/PrivateIPv4Address: 0
then you need to re-create the node group with a different instance type...
Then try to deploy this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-test
  namespace: default
spec:
  selector:
    matchLabels:
      app: windows-server-iis-test
      tier: backend
      track: stable
  replicas: 1
  template:
    metadata:
      labels:
        app: windows-server-iis-test
        tier: backend
        track: stable
    spec:
      containers:
        - name: windows-server-iis-test
          image: mcr.microsoft.com/windows/servercore:1809
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
          command:
            - powershell.exe
            - -command
            - "Add-WindowsFeature Web-Server; Invoke-WebRequest -UseBasicParsing -Uri 'https://dotnetbinaries.blob.core.windows.net/servicemonitor/2.0.1.6/ServiceMonitor.exe' -OutFile 'C:\\ServiceMonitor.exe'; echo '<html><body><br/><br/><marquee><H1>Hello EKS!!!<H1><marquee></body><html>' > C:\\inetpub\\wwwroot\\default.html; C:\\ServiceMonitor.exe 'w3svc'; "
          resources:
            limits:
              cpu: 256m
              memory: 256Mi
            requests:
              cpu: 128m
              memory: 100Mi
      nodeSelector:
        kubernetes.io/os: windows
---
apiVersion: v1
kind: Service
metadata:
  name: windows-server-iis-test
  namespace: default
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: windows-server-iis-test
    tier: backend
    track: stable
  sessionAffinity: None
  type: ClusterIP
kubectl proxy
Then open http://localhost:8001/api/v1/namespaces/default/services/http:windows-server-iis-test:80/proxy/default.html in a browser; it will show a webpage with the Hello EKS text.

Command terminated with exit code 7

I have a 3-node K8s cluster. I have created 3 replica pods on which an application (app1) is running. I have created a service by applying a service YAML file, and I can see the cluster IP by running kubectl get service.
When I try to curl from one of the nodes I am getting "curl: (7) Failed to connect".
When I try to curl from inside the pod I am getting "command terminated with exit code 7".
Commands Ran:
kubectl run kubia --image=kubia --port=8080 --generator=run/v1
kubectl scale rc kubia --replicas=3
Manifest file used:
apiVersion: v1
kind: Service
metadata:
  name: kubia
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: kubia
Can anybody help me with this?
Thanks
Solution: In the YAML file the selector should be run: kubia instead of app: kubia. I deleted the old service and created a new one, and now I am able to curl the internal IP from the pod. Thanks.
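For reference, the corrected Service might look like this (a sketch; kubectl run kubia ... with the run/v1 generator labels the created pods with run: kubia, which is why the selector has to match that label):
apiVersion: v1
kind: Service
metadata:
  name: kubia
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    run: kubia   # matches the label set by kubectl run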

Kubeadm join fail. Is my master cluster IP 192.168.0.9 or 10.96.0.1?

When I run the kubeadm token create --print-join-command, I get this:
"kubeadm join 192.168.0.9:6443 --token ff9ega.4ad2z5yn2gicfvmc --discovery-token-ca-cert-hash sha256:66884e1573b3aa1644ba5c724a53703d2c497f9c0e9131325866057937e8c154"
When I run that join command on my node, I get this error:
[discovery] Trying to connect to API Server "192.168.0.9:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.0.9:6443"
[discovery] Failed to request cluster info, will try again: [Get https://192.168.0.9:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 192.168.0.9:6443: i/o timeout]
When I run a kubectl get svc kubernetes -o yaml
I get this, showing a cluster IP of 10.96.0.1:
"apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2019-02-07T00:44:45Z"
labels:
component: apiserver
provider: kubernetes
name: kubernetes
namespace: default
resourceVersion: "6"
selfLink: /api/v1/namespaces/default/services/kubernetes
uid: 833b1756-2a71-11e9-9ef2-fa163ec9e592
spec:
clusterIP: 10.96.0.1
ports:
- name: https
port: 443
protocol: TCP
targetPort: 6443
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}"
10.96.0.1 is the ClusterIP of the kubernetes Service that fronts the API server; it is only routable inside the k8s cluster. The other IP you mentioned, 192.168.0.9, is the master server's IP address. Ensure the nodes can reach the master server before you run the join command.
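A quick way to verify that from the worker node before joining (a sketch; either command is enough if the API server port is reachable):
# TCP reachability of the API server from the worker node
nc -vz 192.168.0.9 6443
# or hit the health endpoint (the certificate is self-signed, hence -k)
curl -k https://192.168.0.9:6443/healthz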
You can advertise that IP address, which is accessible from your nodes:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.9
If you have already executed kubeadm init, you can revert any changes made by kubeadm init or kubeadm join with:
kubeadm reset
After this, you can run it again.
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.0.9