I'm new to k8s and need some direction on how to troubleshoot.
I have a postgres container and a graphql container. The graphql container tries to connect to postgres on startup.
Problem
The graphql container can't connect to postgres. This is the error on startup:
{"internal":"could not connect to server: Connection refused\n\tIs the server running on host "my-app" (xxx.xx.xx.xxx) and accepting\n\tTCP/IP connections on port 5432?\n",
"path":"$","error":"connection error","code":"postgres-error"}
My understanding is that the graphql-container doesn't recognize the IP my-app (xxx.xx.xx.xxx). This is the actual Pod Host IP, so I'm confused as to why it doesn't recognize it. How do I troubleshoot errors like these?
What I tried
Hardcoding the host in the connection uri in deployment.yaml to the actual pod host IP. Same error.
Bashed into the graphql container and verified that it had the correct env values with the env command.
deployment.yaml
spec:
  selector:
    matchLabels:
      service: my-app
  template:
    metadata:
      labels:
        service: my-app
    ...
      - name: my-graphql-container
        image: image-name:latest
        env:
          - name: MY_POSTGRES_HOST
            value: my-app
          - name: MY_DATABASE
            value: db
          - name: MY_POSTGRES_DB_URL # the postgres connection url that the graphql container uses
            value: postgres://$(user):$(pw)@$(MY_POSTGRES_HOST):5432/$(MY_DATABASE)
      ...
      - name: my-postgres-db
        image: image-name:latest
In k8s docs about pods you can read:
Pods in a Kubernetes cluster are used in two main ways:
Pods that run a single container. [...]
Pods that run multiple containers that need to work together. [...]
Note: Grouping multiple co-located and co-managed containers in a
single Pod is a relatively advanced use case. You should use this
pattern only in specific instances in which your containers are
tightly coupled.
Each Pod is meant to run a single instance of a given
application. [...]
Notice that your deployment doesn't fit this description, because you are trying to run two applications in one pod.
As a rule, run one container per pod, and only put multiple containers in a pod if it's impossible to separate them (and for some reason they have to run together).
And the rest was already mentioned by David.
I am trying to deploy mongodb with kubernetes (gke more precisely). This database is used by a microservice which only needs to read in the database, so I thought of deploying multiple pods with a mongodb docker in each of them, so that the work is shared between them. To do that I created a mongodb image in which I uploaded my mongodb that I previously used with a single docker.
(Here is its Dockerfile, a single deployment of this image works in k8s so I guess that this may not be linked to the issue)
FROM mongo:latest
EXPOSE 27017
COPY /mdb/ /data/db
As the number of requests to the db varies during the day, I want to use GKE horizontal autoscaling for those "mongodb pods". Autoscaling works in the sense that new pods are created when the CPU utilization goes over the target I set in my horizontal pod autoscaler, but these new pods are not used by the service I created for the deployment, and that's my issue.
Something strange to me is that the new pods local IP addresses appear in my service endpoints, and when I delete the initial pod, which is the only one working at that moment, then the other pods created by the autoscaler get activated and so I finally get a performance improvement. However, this is obviously not a solution to me, and moreover other pods created after I deleted the initial one don't get used either.
Here are the YAML files for my mongodb deployment and service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: "$my_mongodb_image_in_which_I_have_my_db"
          ports:
            - containerPort: 27017
          resources:
            requests:
              memory: "1800Mi"
              cpu: "3000m"
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  type: LoadBalancer
  loadBalancerIP: $IP_reserved_for_this_service
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 80
      targetPort: 27017
And I am accessing those mongodb through pymongo, in programs that run in another pod in the same gke cluster:
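from pymongo import MongoClient

def get_db(database: str):
    # Connect through the LoadBalancer Service (port 80 forwards to 27017 on the pods).
    client = MongoClient(host="$IP_reserved_for_this_service",
                         port=80,
                         username="...",
                         password="...",
                         authSource="admin")
    return client.get_database(database)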
This way of using and autoscaling mongodb might be weird and quite impractical but it's only a first model for me and I would like to make it work (or understand why it can't work).
Here are screenshots showing the behaviour of those pods:
state 1: only the initial pod is working
...but all ips appear in service endpoints
state 2: initial pod deleted, the other work now (except the new one created by the autoscaler after the deletion)
...and the endpoints are updated in the service (the update is in the "+ 1 more ...", I checked in the google console)
I feel that the problem might come either from the configuration of my mongodb-service or from the way k8s or gke deals with mongodb images (anyway since I'm new to k8s I might be completely wrong on that too).
Any help or comment will be appreciated, and if you need more information let me know.
This is caused by sticky connections, a common and well-known behaviour in Kubernetes. Kubernetes doesn't balance packets, it balances connections. Once your app has established a connection to a service, all requests to this service go through that connection, and therefore to the same pod. A backend is only chosen again when a new connection is opened.
One option to solve this is a headless service: https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
Or a service mesh, but that's overkill for your case.
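To make the difference concrete, here is a small toy model in Python. It is only a sketch, not how kube-proxy is implemented: the class and function names are made up for illustration, and the round-robin choice is an assumption (the real selection may well be random). It shows why a client that keeps one long-lived connection, as a MongoClient typically does, only ever reaches one pod.

```python
import itertools


class ConnectionLevelBalancer:
    """Toy model: a backend is chosen once, when a connection is opened."""

    def __init__(self, backends):
        # Round robin is an assumption made for the sake of a deterministic example.
        self._cycle = itertools.cycle(backends)

    def connect(self):
        # Every call models opening a new TCP connection to the Service.
        return next(self._cycle)


def send_requests(balancer, n_requests, reuse_connection):
    """Count how many requests each backend receives."""
    hits = {}
    pinned = balancer.connect() if reuse_connection else None
    for _ in range(n_requests):
        # With a reused connection every request goes to the pinned backend.
        target = pinned if reuse_connection else balancer.connect()
        hits[target] = hits.get(target, 0) + 1
    return hits


pods = ["pod-a", "pod-b", "pod-c"]
print(send_requests(ConnectionLevelBalancer(pods), 6, reuse_connection=True))
print(send_requests(ConnectionLevelBalancer(pods), 6, reuse_connection=False))
```

In the first call all six requests land on a single pod; in the second they are spread over all three, because each request opens its own connection.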
In Google Cloud Platform, my goal is to have one cluster with a message queue and a pod to consume these messages in another cluster with MCS (Multi Cluster Service). When trying this out with only one cluster it went fairly smoothly. I used the container name with the port number as the endpoint to connect to the redpanda message queue like this:
Now I want to do this between two clusters, but I'm having trouble configuring stuff right. This is my setup:
I followed this guide to set the clusters up which seemed to work (hard to tell, but no errors), and the redpanda application inside the pod is configured to be on localhost:9092. Unfortunately, I'm getting a Connection Error when running the consumer on my-service-export.my-ns.svc.clusterset.local:9092.
Is it correct to expose the pod with the message queue on its localhost?
Are there ways I can debug or test the connection between pods easier?
Ok, got it working. I obviously misread the setup at one point and had to re-do some stuff to get it working.
Also, the my-service-export should probably have the same name as the service you want to export, in my case redpanda.
A helpful tool to check the connection without spinning up a consumer is the simple dnsutils image. Use this Pod manifest and change the namespace to my-ns:
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: my-ns # <----
spec:
  containers:
    - name: dnsutils
      image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
  restartPolicy: Always
Then spin it up with apply, exec into it, then run host my-service-export.my-ns.svc.clusterset.local. If you get an IP back you are probably good.
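If the pod you are debugging from happens to have Python available (an assumption, the dnsutils image may not), the same check can be sketched in a few lines. The function name check_endpoint and its return values are made up for this example. It separates "the name does not resolve" from "the name resolves but nothing accepts the connection", which are two different problems.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Return 'dns-failure', 'tcp-failure' or 'ok' for host:port."""
    try:
        # Step 1: can the name be resolved at all?
        socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return "dns-failure"
    try:
        # Step 2: does something accept TCP connections on that port?
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp-failure"


# Example call from inside the cluster:
# print(check_endpoint("my-service-export.my-ns.svc.clusterset.local", 9092))
```

A "dns-failure" points at the service export or the DNS setup, while a "tcp-failure" suggests the name is fine but the application is not listening on an address that is reachable from other pods.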
I've deployed a Docker registry inside my Kubernetes cluster:
$ kubectl get service
NAME                       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
registry-docker-registry   ClusterIP   10.43.39.81   <none>        443/TCP   162m
I'm able to pull images from my machine (service is exposed via an ingress rule):
$ docker pull registry-docker-registry.registry/skaffold-covid-backend:c5dfd81-dirty@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd
...
Status: Downloaded newer image for registry-do...
To test it, I wrote a Deployment that uses my image in the same Kubernetes cluster:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: covid-backend
  namespace: skaffold
spec:
  replicas: 3
  selector:
    matchLabels:
      app: covid-backend
  template:
    metadata:
      labels:
        app: covid-backend
    spec:
      containers:
        - image: registry-docker-registry.registry/skaffold-covid-backend:c5dfd81-dirty@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd
          name: covid-backend
          ports:
            - containerPort: 8080
Then I tried to deploy it:
$ cat pod.yaml | kubectl apply -f -
However, kubernetes isn't able to reach registry:
Extract of kubectl get events:
6s Normal Pulling pod/covid-backend-774bd78db5-89vt9 Pulling image "registry-docker-registry.registry/skaffold-covid-backend:c5dfd81-dirty@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd"
1s Warning Failed pod/covid-backend-774bd78db5-89vt9 Failed to pull image "registry-docker-registry.registry/skaffold-covid-backend:c5dfd81-dirty@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd": rpc error: code = Unknown desc = failed to pull and unpack image "registry-docker-registry.registry/skaffold-covid-backend@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd": failed to resolve reference "registry-docker-registry.registry/skaffold-covid-backend@sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd": failed to do request: Head https://registry-docker-registry.registry/v2/skaffold-covid-backend/manifests/sha256:76312ebc62c4b3dd61b4451fe01b1ecd2e6b03a2b3146c7f25df3d3cfb4512cd: dial tcp: lookup registry-docker-registry.registry: Try again
1s Warning Failed pod/covid-backend-774bd78db5-89vt9 Error: ErrImagePull
As you can see, kubernetes is not able to get access to the internal deployed registry...
Any ideas?
I would recommend following the k3d docs.
More precisely, this section:
Using your own local registry
If you don't want k3d to manage your registry, you can start it with some docker commands, like:
docker volume create local_registry
docker container run -d --name registry.local -v local_registry:/var/lib/registry --restart always -p 5000:5000 registry:2
These commands will start your registry at registry.local:5000. In order to push to this registry, you will need to add the line to /etc/hosts as described in the previous section. Once your registry is up and running, you will need to add it to your registries.yaml configuration file. Finally, you must connect the registry network to the k3d cluster network: docker network connect k3d-k3s-default registry.local. Then you can check your local registry.
Pushing to your local registry address
The registry will be located, by default, at registry.local:5000 (customizable with the --registry-name and --registry-port parameters). All the nodes in your k3d cluster can resolve this hostname (thanks to the DNS server provided by the Docker daemon) but, in order to be able to push to this registry, this hostname must also be resolvable from your host.
The easiest solution for this is to add an entry in your /etc/hosts file like this:
127.0.0.1 registry.local
Once again, this will only work with k3s >= v0.10.0 (see the section below when using k3s <= v0.9.1)
Local registry volume
The local k3d registry uses a volume for storing the images. This volume will be destroyed when the k3d registry is released. In order to persist this volume and make these images survive the removal of the registry, you can specify a volume with --registry-volume and use the --keep-registry-volume flag when deleting the cluster. This will create a volume with the given name the first time the registry is used, while successive invocations will just mount this existing volume in the k3d registry container.
Docker Hub cache
The local k3d registry can also be used for caching images from the Docker Hub. You can start the registry as a pull-through cache when the cluster is created with --enable-registry-cache. Used in conjunction with --registry-volume/--keep-registry-volume, this can speed up all the downloads from the Hub by keeping a persistent cache of images on your local machine.
Testing your registry
You should test that you can
push to your registry from your local development machine.
use images from that registry in Deployments in your k3d cluster.
We will verify these two things for a local registry (located at registry.local:5000) running in your development machine. Things would be basically the same for checking an external registry, but some additional configuration could be necessary in your local machine when using an authenticated or secure registry (please refer to Docker's documentation for this).
Firstly, we can download some image (like nginx) and push it to our local registry with:
docker pull nginx:latest
docker tag nginx:latest registry.local:5000/nginx:latest
docker push registry.local:5000/nginx:latest
Then we can deploy a pod referencing this image to your cluster:
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test-registry
  labels:
    app: nginx-test-registry
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-test-registry
  template:
    metadata:
      labels:
        app: nginx-test-registry
    spec:
      containers:
        - name: nginx-test-registry
          image: registry.local:5000/nginx:latest
          ports:
            - containerPort: 80
EOF
Then you should check that the pod is running with kubectl get pods -l "app=nginx-test-registry".
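As an extra check that does not involve the cluster at all, you can ask the registry which repositories it holds. The sketch below is my own addition, not part of the k3d docs: it calls the /v2/_catalog endpoint of the Docker Registry HTTP API V2 with the Python standard library. The function name list_repositories is made up, and plain HTTP without authentication is an assumption that matches the registry:2 container started above.

```python
import json
import urllib.request


def list_repositories(base_url, timeout=5.0):
    """Return the repository names reported by a registry's /v2/_catalog endpoint."""
    with urllib.request.urlopen(f"{base_url}/v2/_catalog", timeout=timeout) as resp:
        payload = json.load(resp)
    # The catalog response has the form {"repositories": ["nginx", ...]}.
    return payload.get("repositories", [])


# Example call from the development machine, after the push above:
# print(list_repositories("http://registry.local:5000"))
```

If nginx shows up in the list, the push worked and any remaining pull failure is more likely a name resolution or network problem on the node side.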
Additionally, there are 2 GitHub links worth visiting:
K3d not able resolve dns
You could try the answer provided by @rjshrjndrn; it might solve your DNS issue.
docker images are not pulled from docker repository behind corporate proxy
An open GitHub issue on k3d with the same problem as yours.
I am using docker + AKS to manage my containers. When I run my containers locally or on a VM using docker-compose, my services (which are containerized) can communicate with my databases, which are also in containers. The bridge between these containers is created using networks. After I converted the docker-compose file for all of my applications to the respective YAML counterparts and deployed my containers to AKS (single node), my containerized services are not able to reach the database.
All my containers have 3 yaml files
Pvc
deployment(for pods)
svc.
I've gone through many of the getting started with AKS examples and for some reason am not able to figure it out. All application services are exposed publicly using load balancers. My question is more like how do I define which db the application services should connect to now that the concept of networks doesn't exist anymore.
In the examples provided for AKS, all the front-end services do is create an env variable and specify the name of the backend service. I tried that as well and my application still doesn't work. The sample that I referred to in order to validate my setup is https://learn.microsoft.com/en-gb/azure/aks/kubernetes-walkthrough#run-the-application.
Any help would be great.
If you only need these services internally, you should not expose them publicly using load balancers.
Kubernetes offers two mechanisms for service discovery: DNS and environment variables. While DNS is an optional component, I have not seen any cluster without it, and I assume that AKS uses it too.
So, for example you have a Postgres database and want to use it somewhere else:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  replicas: 1
  selector: # required by apps/v1; must match the pod template labels
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: db
          image: postgres:11
          ports:
            - name: postgres
              containerPort: 5432
This creates a deployment which exposes port 5432. The label app: postgres is also important here, since we need it later to identify the created Pods.
Now we need to create a service for it:
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  type: ClusterIP # default value
  selector:
    app: postgres
  ports:
    - port: 5432
This creates a virtual IP address and registers all ready pods with the label app: postgres to it. Since the name of the service is postgres and it is in the default namespace, postgres is now accessible via postgres.default.svc.cluster.local:5432. You can use this address and port in your other application (e.g. Python) to connect to the database.
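On the application side this could look roughly like the following Python sketch. The helper names service_address and postgres_url are made up for illustration, and the cluster domain cluster.local is assumed to be the default one. It first looks for the POSTGRES_SERVICE_HOST and POSTGRES_SERVICE_PORT variables that Kubernetes injects into pods started after the Service exists, and otherwise falls back to the DNS name.

```python
import os
from urllib.parse import quote


def service_address(name, namespace="default", port=None, env=os.environ):
    """Return (host, port) for a Service via injected env vars, else its DNS name."""
    # Kubernetes upper-cases the Service name and turns dashes into underscores.
    prefix = name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST")
    env_port = env.get(f"{prefix}_SERVICE_PORT")
    if host and env_port:
        return host, int(env_port)
    # Fallback: cluster DNS name, assuming the default domain cluster.local.
    return f"{name}.{namespace}.svc.cluster.local", port


def postgres_url(user, password, database, host, port):
    """Build a connection URL; user and password are percent-encoded."""
    return (f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
            f"@{host}:{port}/{database}")


host, port = service_address("postgres", port=5432)
print(postgres_url("app", "secret", "db", host, port))
```

The resulting URL can be handed to whatever Postgres driver the application uses. Note that the environment variables are only set for pods created after the Service, so DNS is usually the more robust choice.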
I'm trying to run Cloudera Manager and Cloudera agents on OpenShift. In order to run the installation, I need all the pods to be able to communicate with each other.
Manually, I modified /etc/hosts on the manager and added all the agents, and on the agents I added the manager and all the other agents.
Now I want to automate this. Suppose I add a new agent: I want it to resolve the manager and the other hosts (I can get part of it done by passing the manager name as an env variable and adding it to /etc/hosts with a shell script, which is not ideal but still a solution). The second part is more difficult: getting the manager to resolve every new agent, and every agent to resolve every other agent on the same service.
I was wondering if there is a way for every pod in the cluster to resolve the others' names?
I have two services: cloudera-manager with one pod, and another service cloudera-agent with, let's say, 3 agents.
Do you have any idea?
thank you.
Not sure, but it looks like you could benefit from StatefulSets.
There are other ways to get the other pods' IPs (like using a headless service or querying the API server directly), but StatefulSets provide:
Stable, unique network identifiers
Stable, persistent storage.
Lots of other functionality that facilitates the deployment of special kinds of clusters, like distributed databases. Not sure my term 'distributed' here is correct, but it helps me remember what they are for :).
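The "stable network identifiers" are what matter for your /etc/hosts problem: each pod of a StatefulSet gets a predictable DNS name of the form <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>, provided the StatefulSet is governed by a headless Service. Here is a minimal Python sketch that lists those names. The function name is made up, the names cloudera-agent and my-ns are placeholders, and cluster.local is assumed as the cluster domain.

```python
def statefulset_pod_hosts(statefulset, service, namespace, replicas,
                          cluster_domain="cluster.local"):
    """Return the stable DNS names of the pods of a StatefulSet."""
    return [
        # Ordinals start at 0 and go up to replicas - 1.
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for host in statefulset_pod_hosts("cloudera-agent", "cloudera-agent", "my-ns", 3):
    print(host)
```

Because these names do not change when a pod is rescheduled, the manager and the agents can refer to each other by name without anyone editing /etc/hosts.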
If you want to get all Pods running under a certain Service, make sure to use a headless Service (i.e. set clusterIP: None). Then, you can query your local DNS-Server for the Service and will receive A-Records for all Pods assigned to it:
---
apiVersion: v1
kind: Service
metadata:
  name: my-sv
  namespace: my-ns
  labels:
    app: my-app
spec:
  clusterIP: None
  selector:
    app: my-app
Then start your Pods (make sure to give them the app: my-app label so that the Service selects them) and query your DNS server from any of them:
kubectl exec -ti my-pod --namespace=my-ns -- /bin/bash
$ nslookup my-sv.my-ns.svc.cluster.local
Server: 10.255.3.10
Address: 10.255.3.10#53
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.24.11
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.5.73
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.87.6
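The same lookup can be done from application code. The following is a minimal sketch in Python using only the standard library; the function name pod_ips is made up, and the lookup is restricted to IPv4 to match the A records shown above.

```python
import socket


def pod_ips(service_fqdn):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name."""
    try:
        infos = socket.getaddrinfo(service_fqdn, None,
                                   family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name does not resolve (yet), e.g. no pod is ready.
        return []
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})


# Example call from inside a pod:
# print(pod_ips("my-sv.my-ns.svc.cluster.local"))
```

For a headless Service this returns one address per ready pod, while for a normal ClusterIP Service it would return only the single virtual IP.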