K8s - Cannot access service/clusterIP from another node - kubernetes

I am trying to access(from worker node) a pod(on worker node) via a service/ClusterIP using curl http://cluster_ip:port_no but it isn't working.
Here's some info on service
masternode#Master:/localdocker$ kubectl describe svc registry
Name: registry
Namespace: default
Labels: io.kompose.service=registry
Annotations: kompose.cmd: kompose convert -f docker-compose.yaml -o localregistry.yaml
kompose.version: 1.1.0 (36652f6)
Selector: io.kompose.service=registry
Type: ClusterIP
IP: 10.100.126.230
Port: 5000 5000/TCP
TargetPort: 5000/TCP
Endpoints: 192.168.171.74:5000
Session Affinity: None
Events: <none>```
here's some info on pod
masternode#Master:/localdocker$ kubectl describe pod registry-7ccd695dc7-69cx4
Name: registry-7ccd695dc7-69cx4
Namespace: default
Priority: 0
Node: worker/10.0.1.5
Start Time: Sun, 19 Jul 2020 06:09:14 +0000
Labels: io.kompose.service=registry
pod-template-hash=7ccd695dc7
Annotations: cni.projectcalico.org/podIP: 192.168.171.74/32
cni.projectcalico.org/podIPs: 192.168.171.74/32
Status: Running
IP: 192.168.171.74
IPs:
IP: 192.168.171.74
Controlled By: ReplicaSet/registry-7ccd695dc7
Containers:
registry:
Container ID: docker://ca372f12ef7a1a3cb23e7d6c58337f47848f91212f6c75af6bfd04bc48ea2f27
Image: registry:2
Image ID: docker-pullable://registry#sha256:8be26f81ffea54106bae012c6f349df70f4d5e7e2ec01b143c46e2c03b9e551d
Port: 5000/TCP
Host Port: 0/TCP
State: Running
Started: Sun, 19 Jul 2020 06:09:24 +0000
Ready: True
Restart Count: 0
Environment:
REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /data
Mounts:
/data from registry-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-fhf5k (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
registry-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: registry-claim0
ReadOnly: false
default-token-fhf5k:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fhf5k
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned default/registry-7ccd695dc7-69cx4 to worker
Normal Pulling 13m kubelet, worker Pulling image "registry:2"
Normal Pulled 13m kubelet, worker Successfully pulled image "registry:2"
Normal Created 13m kubelet, worker Created container registry
Normal Started 13m kubelet, worker Started container registry
This is a practice exercise where I was able to do so(in their live environment) without any NodePorts.
Please let me know if any other info is required.

This is an expected behavior because ClusterIP type service is only accessible from within the kubernetes cluster i.e from another pod etc.
If you want to access a pod via a service from outside the kubernetes cluster i.e from the nodes itself then use NodePort type service.
Once you expose it via NodePort service you would be able to access it using curl http://<NODE-IP>:<NODE-PORT>
ClusterIP is created on service network of the cluster and nodes are in different network. By creating a NodePort service a Port is opened in each nodes network to forward traffic to ClusterIP. So in essence NodePort uses ClusterIP internally and is an higher level abstraction built on top of ClusterIP.

Related

Kubernetes Pod Stuck in Pending Without Indicating Any Reason

We are using client-go to create kubernetes jobs and deployments. Today in one of our cluster (kubernetes v1.18.19), I encounter below weird problem.
Pods of kubernetes Job are always stuck in Pending status, without any reasons. kubectl describe pod shows there are no events. Creating Jobs from host (via kubectl) are normal and pods became running eventually.
What surprises me is Creating Deployments is ok, pods get running eventually!! It won't work only for Kubernetes Jobs. Why? How to fix that?? What I can do?? I have taken hours here but got no progress.
kubeconfig by client-go:
Mount from host machine, path: /root/.kube/config
kubectl describe job shows:
Name: unittest
Namespace: default
Selector: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
Labels: job-id=unittest
Annotations: <none>
Parallelism: 1
Completions: 1
Start Time: Sat, 19 Jun 2021 00:20:12 +0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
job-name=unittest
Containers:
unittest:
Image: ubuntu:18.04
Port: <none>
Host Port: <none>
Command:
echo hello
Environment: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 21m job-controller Created pod: unittest-tt5b2
Kubectl describe on target pod shows:
Name: unittest-tt5b2
Namespace: default
Priority: 0
Node: <none>
Labels: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
job-name=unittest
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: Job/unittest
Containers:
unittest:
Image: ubuntu:18.04
Port: <none>
Host Port: <none>
Command:
echo hello
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-72g27 (ro)
Volumes:
default-token-72g27:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-72g27
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
kubectl get events shows:
55m Normal ScalingReplicaSet deployment/job-scheduler Scaled up replica set job-scheduler-76b7465d74 to 1
19m Normal ScalingReplicaSet deployment/job-scheduler Scaled up replica set job-scheduler-74f8896f48 to 1
58m Normal SuccessfulCreate job/unittest Created pod: unittest-pp665
49m Normal SuccessfulCreate job/unittest Created pod: unittest-xm6ck
17m Normal SuccessfulCreate job/unittest Created pod: unittest-tt5b2
I fixed the issue.
We use a custom scheduler for NPU devices and default scheduler for GPU devices. For GPU devices, the scheduler name is "default-scheduler" other than "default". I passed "default" for those kube Jobs, this causes the pods to stuck in pending.

K8s tutorial fails on my local installation with i/o timeout

I'm working on a local kubernetes installation with three nodes. They are installed via geerlingguy/kubernetes Ansible role (with default settings). I've recreated the whole VMs multiple times. I try to follow the Kubernetes tutorials on https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-interactive/ to get services up and running inside the cluster and try to reach them now.
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
enceladus Ready <none> 162m v1.17.9
mimas Ready <none> 162m v1.17.9
titan Ready master 162m v1.17.9
I tried it with the 1.17.9 or 1.18.6, I tried it with https://github.com/geerlingguy/ansible-role-kubernetes and https://github.com/kubernetes-sigs/kubespray on fresh Debian-Buster VMs. I tried it with Flannel and Calico network plugin. There is no a firewall configured.
I can deploy the kubernetes-bootcamp and exec into it, but when I try to reach the pod via kubectl proxy and curl I'm getting an error.
# kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
# kubectl describe pods
Name: kubernetes-bootcamp-69fbc6f4cf-nq4tj
Namespace: default
Priority: 0
Node: enceladus/192.168.10.12
Start Time: Thu, 06 Aug 2020 10:53:34 +0200
Labels: app=kubernetes-bootcamp
pod-template-hash=69fbc6f4cf
Annotations: <none>
Status: Running
IP: 10.244.1.4
IPs:
IP: 10.244.1.4
Controlled By: ReplicaSet/kubernetes-bootcamp-69fbc6f4cf
Containers:
kubernetes-bootcamp:
Container ID: docker://77eae93ca1e6b574ef7b0623844374a5b2f3054075025492b708b23fc3474a45
Image: gcr.io/google-samples/kubernetes-bootcamp:v1
Image ID: docker-pullable://gcr.io/google-samples/kubernetes-bootcamp#sha256:0d6b8ee63bb57c5f5b6156f446b3bc3b3c143d233037f3a2f00e279c8fcc64af
Port: <none>
Host Port: <none>
State: Running
Started: Thu, 06 Aug 2020 10:53:35 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kkcvk (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-kkcvk:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-kkcvk
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10s default-scheduler Successfully assigned default/kubernetes-bootcamp-69fbc6f4cf-nq4tj to enceladus
Normal Pulled 9s kubelet, enceladus Container image "gcr.io/google-samples/kubernetes-bootcamp:v1" already present on machine
Normal Created 9s kubelet, enceladus Created container kubernetes-bootcamp
Normal Started 9s kubelet, enceladus Started container kubernetes-bootcamp
Update service list
# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4d20h
I can exec curl inside the deployment. It is running.
# kubectl exec -ti kubernetes-bootcamp-69fbc6f4cf-nq4tj curl http://localhost:8080/
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-69fbc6f4cf-nq4tj | v=1
But, when I try to curl from master node the response is not good:
curl http://localhost:8001/api/v1/namespaces/default/pods/kubernetes-bootcamp-69fbc6f4cf-nq4tj/proxy/
Error trying to reach service: 'dial tcp 10.244.1.4:80: i/o timeout'
The curl itself needs ca. 30sec to return. The version etc. is available. The proxy is running fine.
# curl http://localhost:8001/version
{
"major": "1",
"minor": "17",
"gitVersion": "v1.17.9",
"gitCommit": "4fb7ed12476d57b8437ada90b4f93b17ffaeed99",
"gitTreeState": "clean",
"buildDate": "2020-07-15T16:10:45Z",
"goVersion": "go1.13.9",
"compiler": "gc",
"platform": "linux/amd64"
}
The tutorial shows on kubectl describe pods that the container has open ports (in my case it's <none>):
Port: 8080/TCP
Host Port: 0/TCP
Ok, I than created an apply-file bootcamp.yml
apiVersion: apps/v1
kind: Deployment
metadata:
name: kubernetes-bootcamp
spec:
replicas: 1
selector:
matchLabels:
app: kubernetes-bootcamp
template:
metadata:
labels:
app: kubernetes-bootcamp
spec:
containers:
- name: kubernetes-bootcamp
image: gcr.io/google-samples/kubernetes-bootcamp:v1
ports:
- containerPort: 8080
protocol: TCP
I removed the previous deployment
# kubectl delete deployments.apps kubernetes-bootcamp --force
# kubectl apply -f bootcamp.yaml
But after that I'm getting still the same i/o timeout on the new deployment.
So, what is my problem?

microk8s clusterIP service appear to be doing source NAT when the documentation says it should not

I am having a networking issue in Kubernetes.
I am trying to preserve the source IP of incoming requests to a clusterIP service, but I find that the requests appear to be source NAT'd. That is, they carry the IP address of the node as the source IP rather than the IP of the pod making the request. I am following the example for cluster IPs here: https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-clusterip
but I find that the behavior of Kubernetes is totally different for me.
The above example has me deploy an echo server which reports the source IP. This is deployed behind a clusterIP service which I request from a separate pod running busybox. The response from the echo server is below:
CLIENT VALUES:
client_address=10.1.36.1
command=GET
real path=/
query=nil
request_version=1.1
request_uri=http://10.152.183.99:8080/
SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001
HEADERS RECEIVED:
connection=close
host=10.152.183.99
user-agent=Wget
BODY
The source IP 10.1.36.1 belongs to the node. I expected to see the address of busybox which is 10.1.36.168.
Does anyone know why SNAT would be enabled for a clusterIP? It's really strange to me that this directly contradicts the official documentation. (edited)
All of this is running on the same node. The node is running in iptables mode. I am using microk8s.
My microk8s version:
Client:
Version: v1.2.5
Revision: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
Server:
Version: v1.2.5
Revision: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
Output of kubectl describe service clusterip:
Name: clusterip
Namespace: default
Labels: app=source-ip-app
Annotations: <none>
Selector: app=source-ip-app
Type: ClusterIP
IP: 10.152.183.106
Port: <unset> 80/TCP
TargetPort: 8080/TCP
Endpoints: 10.1.36.225:8080
Session Affinity: None
Events: <none>
Output of kubectl describe pod source-ip-app-7c79c78698-xgd5w:
Name: source-ip-app-7c79c78698-xgd5w
Namespace: default
Priority: 0
Node: riley-virtualbox/10.0.2.15
Start Time: Wed, 12 Feb 2020 09:19:18 -0600
Labels: app=source-ip-app
pod-template-hash=7c79c78698
Annotations: <none>
Status: Running
IP: 10.1.36.225
IPs:
IP: 10.1.36.225
Controlled By: ReplicaSet/source-ip-app-7c79c78698
Containers:
echoserver:
Container ID: containerd://6775c010145d3951d067e3bb062bea9b70d305f96f84aa870963a8b385a4a118
Image: k8s.gcr.io/echoserver:1.4
Image ID: sha256:523cad1a4df732d41406c9de49f932cd60d56ffd50619158a2977fd1066028f9
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 12 Feb 2020 09:19:23 -0600
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-7pszf (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-7pszf:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-7pszf
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/source-ip-app-7c79c78698-xgd5w to riley-virtualbox
Normal Pulled 2m58s kubelet, riley-virtualbox Container image "k8s.gcr.io/echoserver:1.4" already present on machine
Normal Created 2m55s kubelet, riley-virtualbox Created container echoserver
Normal Started 2m54s kubelet, riley-virtualbox Started container echoserver

Kubernetes master doesn't attach FlexVolume

I'm trying to attach the dummy-attachable FlexVolume sample for Kubernetes which seems to initialize normally according to my logs on both the nodes and master:
Loaded volume plugin "flexvolume-k8s/dummy-attachable
But when I try to attach the volume to a pod, the attach method never gets called from the master. The logs from the node read:
flexVolume driver k8s/dummy-attachable: using default GetVolumeName for volume dummy-attachable
operationExecutor.VerifyControllerAttachedVolume started for volume "dummy-attachable"
Operation for "\"flexvolume-k8s/dummy-attachable/dummy-attachable\"" failed. No retries permitted until 2019-04-22 13:42:51.21390334 +0000 UTC m=+4814.674525788 (durationBeforeRetry 500ms). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"dummy-attachable\" (UniqueName: \"flexvolume-k8s/dummy-attachable/dummy-attachable\") pod \"nginx-dummy-attachable\"
Here's how I'm attempting to mount the volume:
apiVersion: v1
kind: Pod
metadata:
name: nginx-dummy-attachable
namespace: default
spec:
containers:
- name: nginx-dummy-attachable
image: nginx
volumeMounts:
- name: dummy-attachable
mountPath: /data
ports:
- containerPort: 80
volumes:
- name: dummy-attachable
flexVolume:
driver: "k8s/dummy-attachable"
Here is the ouput of kubectl describe pod nginx-dummy-attachable:
Name: nginx-dummy-attachable
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: [node id]
Start Time: Wed, 24 Apr 2019 08:03:21 -0400
Labels: <none>
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container nginx-dummy-attachable
Status: Pending
IP:
Containers:
nginx-dummy-attachable:
Container ID:
Image: nginx
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/data from dummy-attachable (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hcnhj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
dummy-attachable:
Type: FlexVolume (a generic volume resource that is provisioned/attached using an exec based plugin)
Driver: k8s/dummy-attachable
FSType:
SecretRef: nil
ReadOnly: false
Options: map[]
default-token-hcnhj:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hcnhj
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 41s (x6 over 11m) kubelet, [node id] Unable to mount volumes for pod "nginx-dummy-attachable_default([id])": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-dummy-attachable". list of unmounted volumes=[dummy-attachable]. list of unattached volumes=[dummy-attachable default-token-hcnhj]
I added debug logging to the FlexVolume, so I was able to verify that the attach method was never called on the master node. I'm not sure what I'm missing here.
I don't know if this matters, but the cluster is being launched with KOPS. I've tried with both k8s 1.11 and 1.14 with no success.
So this is a fun one.
Even though kubelet initializes the FlexVolume plugin on master, kube-controller-manager, which is containerized in KOPs, is the application that's actually responsible for attaching the volume to the pod. KOPs doesn't mount the default plugin directory /usr/libexec/kubernetes/kubelet-plugins/volume/exec into the kube-controller-manager pod, so it doesn't know anything about your FlexVolume plugins on master.
There doesn't appear to be a non-hacky way to do this other than to use a different Kubernetes deployment tool until KOPs addresses this problem.

Kubernetes minikube, cannot expose service on public ip range

So I've been playing around with Minkube.
I've managed to deploy a simple python flask container:
PS C:\Users\Will> kubectl run test-flask-deploy --image
192.168.1.201:5000/test_flask:1
deployment "test-flask-deploy" created
I've also then managed to expose the deployment as a service:
PS C:\Users\Will> kubectl expose deployment/test-flask-deploy --
type="NodePort" --port 8080
service "test-flask-deploy" exposed
In the dashboard I can see that the service has a Cluster IP:
10.0.0.132.
I access the dashboard on a 192.168.xxx.xxx address, so I'm hoping I can expose the service on that external IP.
Any idea how I go about this?
A separate and slightly less important question: I've got minikube talking to a docker registry on my network. If i deploy an image (which has not yet been pulled local to the minikube) the deployment fails, yet when I run the docker pull command on minikube locally, the deployment then succeeds. So minikube is able to pull docker images, but when I deploy an image which is accessible via the registry, yet not pulled locally, it fails. Any thoughts?
EDIT: More detail in response to comment:
PS C:\Users\Will> kubectl describe pod test-flask-deploy
Name: test-flask-deploy-1049547027-rgf7d
Namespace: default
Node: minikube/192.168.99.100
Start Time: Sat, 07 Oct 2017 10:19:58 +0100
Labels: pod-template-hash=1049547027
run=test-flask-deploy
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-flask-deploy-1049547027","uid":"b06a14b8-ab40-11e7-9714-080...
Status: Running
IP: 172.17.0.4
Created By: ReplicaSet/test-flask-deploy-1049547027
Controlled By: ReplicaSet/test-flask-deploy-1049547027
Containers:
test-flask-deploy:
Container ID: docker://577e339ce680bc5dd9388293f1f1ea62be59a6acc25be22889310761222c760f
Image: 192.168.1.201:5000/test_flask:1
Image ID: docker-pullable://192.168.1.201:5000/test_flask#sha256:d303ed635888394f69223cc0a66c5778444fd3636dfcde42295fd512be948898
Port: <none>
State: Running
Started: Sat, 07 Oct 2017 10:19:59 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5rrpm (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-5rrpm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5rrpm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events: <none>
First, check the nodeport that is assigned to your service:
$ kubectl get svc test-flask-deploy
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-flask-deploy 10.0.0.76 <nodes> 8080:30341/TCP 4m
Now you should be able to access it on 192.168.xxxx:30341 or whatever your minikubeIP:nodeport is.