Kubernetes requests not balanced

We've just had an increase in traffic to our Kubernetes cluster, and I've noticed that of our 6 application pods, 2 of them are seemingly not used very much. kubectl top pods returns the following:
You can see that of the 6 pods, 4 are using more than 50% of the CPU (2 vCPU nodes), but two of them aren't really doing much at all.
Our cluster is set up on AWS, using the ALB ingress controller. The load balancer is configured to use "least outstanding requests" rather than round robin, in an attempt to balance things out a bit more, but we're still seeing this imbalance.
Is there any way of determining why this is happening, or if indeed it actually is a problem? I'm hoping it's normal behaviour rather than an issue but I'd rather investigate it.
App deployment config
This is the configuration of the main application pods. Nothing fancy, really:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
    app.kubernetes.io/component: web
spec:
  replicas: {{ .Values.app.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "app.fullname" . }}-web
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/config_maps/app-env-vars.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 10 }}
        {{- end }}
      labels:
        {{- include "app.selectorLabels" . | nindent 8 }}
        app.kubernetes.io/component: web
        app: {{ include "app.fullname" . }}-web
    spec:
      serviceAccountName: {{ .Values.serviceAccount.web }}
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - {{ include "app.fullname" . }}-web
                topologyKey: failure-domain.beta.kubernetes.io/zone
      containers:
        - name: {{ .Values.image.name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          command:
            - bundle
          args: ["exec", "puma", "-p", "{{ .Values.app.containerPort }}"]
          ports:
            - name: http
              containerPort: {{ .Values.app.containerPort }}
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /healthcheck
              port: {{ .Values.app.containerPort }}
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 5
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          envFrom:
            - configMapRef:
                name: {{ include "app.fullname" . }}-cm-env-vars
            - secretRef:
                name: {{ include "app.fullname" . }}-secret-rails-master-key
            - secretRef:
                name: {{ include "app.fullname" . }}-secret-db-credentials
            - secretRef:
                name: {{ include "app.fullname" . }}-secret-db-url-app
            - secretRef:
                name: {{ include "app.fullname" . }}-secret-redis-credentials

That's a known issue with Kubernetes.
In short, Kubernetes doesn't load balance long-lived TCP connections.
This excellent article covers it in detail.
The load distribution your service is showing matches that case exactly.
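If you want the ALB itself to spread requests across pods rather than across nodes, one thing to look at (a sketch only, assuming the AWS Load Balancer / ALB ingress controller annotations and a recent cluster; older controller versions use the same annotation keys on the extensions/v1beta1 Ingress) is IP target mode, which registers the pod IPs directly as targets so the least-outstanding-requests algorithm applies per pod:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-web            # hypothetical name, for illustration only
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    # Register pod IPs as ALB targets instead of NodePorts, so balancing happens per pod
    alb.ingress.kubernetes.io/target-type: ip
    # Keep the least-outstanding-requests algorithm on the target group
    alb.ingress.kubernetes.io/target-group-attributes: load_balancing.algorithm.type=least_outstanding_requests
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-web       # hypothetical Service name
                port:
                  number: 80

With the default instance target type, the ALB forwards to NodePorts and kube-proxy then pins each long-lived connection to a single pod, which is roughly the behaviour the answer above describes; IP targets remove that extra hop, so the balancing algorithm applies to the pods themselves.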

Related

Kubernetes cluster unable to pull images from DigitalOcean Registry

My DigitalOcean kubernetes cluster is unable to pull images from the DigitalOcean registry. I get the following error message:
Failed to pull image "registry.digitalocean.com/XXXX/php:1.1.39": rpc error: code = Unknown desc = failed to pull and unpack image
"registry.digitalocean.com/XXXXXXX/php:1.1.39": failed to resolve reference
"registry.digitalocean.com/XXXXXXX/php:1.1.39": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized
I have added the kubernetes cluster using DigitalOcean Container Registry Integration, which shows there successfully both on the registry and the settings for the kubernetes cluster.
I can confirm the above address `registry.digitalocean.com/XXXX/php:1.1.39` matches the one in the registry. I wonder if I'm misunderstanding how the token / login integration works with the registry, but I'm under the impression that this was a "one click" thing and that the cluster would automatically get the connection to the registry after that.
I have tried logging helm into the registry before pushing, but this did not work (and I wouldn't really expect it to; the cluster should be pulling the image).
It's not completely clear to me how the image pull secrets are supposed to be used.
My helm deployment chart is basically the default for API Platform:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "api-platform.fullname" . }}
  labels:
    {{- include "api-platform.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "api-platform.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "api-platform.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "api-platform.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}-caddy
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.caddy.image.repository }}:{{ .Values.caddy.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.caddy.image.pullPolicy }}
          env:
            - name: SERVER_NAME
              value: :80
            - name: PWA_UPSTREAM
              value: {{ include "api-platform.fullname" . }}-pwa:3000
            - name: MERCURE_PUBLISHER_JWT_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "api-platform.fullname" . }}
                  key: mercure-publisher-jwt-key
            - name: MERCURE_SUBSCRIBER_JWT_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "api-platform.fullname" . }}
                  key: mercure-subscriber-jwt-key
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: admin
              containerPort: 2019
              protocol: TCP
          volumeMounts:
            - mountPath: /var/run/php
              name: php-socket
          #livenessProbe:
          #  httpGet:
          #    path: /
          #    port: admin
          #readinessProbe:
          #  httpGet:
          #    path: /
          #    port: admin
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
        - name: {{ .Chart.Name }}-php
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.php.image.repository }}:{{ .Values.php.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.php.image.pullPolicy }}
          env:
            {{ include "api-platform.env" . | nindent 12 }}
          volumeMounts:
            - mountPath: /var/run/php
              name: php-socket
          readinessProbe:
            exec:
              command:
                - docker-healthcheck
            initialDelaySeconds: 120
            periodSeconds: 3
          livenessProbe:
            exec:
              command:
                - docker-healthcheck
            initialDelaySeconds: 120
            periodSeconds: 3
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      volumes:
        - name: php-socket
          emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
How do I authorize the Kubernetes cluster to pull from the registry? Is this a helm thing or a Kubernetes-only thing?
Thanks!
The problem is that you do not have an image pull secret for your cluster to use when pulling from the registry.
You will need to add one to give your cluster a way to authorize its requests to the registry.
Using the DigitalOcean Kubernetes Integration for Container Registry
DigitalOcean provides a way to add image pull secrets to a Kubernetes cluster in your account. You can link the registry to the cluster in the registry's settings: under "DigitalOcean Kubernetes Integration" select Edit, then select the cluster you want to link the registry to.
This adds an image pull secret to all namespaces within the cluster, and it will be used by default (unless you specify otherwise).
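If you prefer not to rely on the one-click integration, a manual alternative is to create the pull secret yourself (a sketch; DigitalOcean's registry accepts an API token as both username and password, and the secret name here is just an example):

kubectl create secret docker-registry do-registry \
  --docker-server=registry.digitalocean.com \
  --docker-username=<your-do-api-token> \
  --docker-password=<your-do-api-token> \
  --namespace=default

The secret then has to be referenced from the pod spec via imagePullSecrets, which is exactly what the chart's imagePullSecrets value controls below.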
The issue was that API Platform automatically has a default value for imagePullSecrets in the helm chart, which is
imagePullSecrets: []
in values.yaml
This seems to prevent Kubernetes from using the image pull secret in the way I expected. The solution was to pass the name of the image pull secret directly to the helm deployment command, like this:
--set "imagePullSecrets[0].name=registry-secret-name-goes-here"
You can view the name of your secret using kubectl get secrets like this:
kubectl get secrets
And the output should look something like this:
NAME                             TYPE                                  DATA   AGE
default-token-lz2ck              kubernetes.io/service-account-token   3      38d
registry-secret-name-goes-here   kubernetes.io/dockerconfigjson        1      2d16h
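Equivalently, instead of the --set flag above, the same secret can be referenced from a values file (a sketch; the name must match whatever kubectl get secrets reports):

# values.yaml (or an override file passed with -f)
imagePullSecrets:
  - name: registry-secret-name-goes-here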

Kubernetes Issue

I have a microservice (on Node.js).
I am creating a Docker image for it and pushing it to my local registry running at
localhost:5001
While deploying this microservice using helm:
helm upgrade --install --wait --set env=dev --set image.tag=localhost:5001/user-service userservice-api ./build/helm --namespace dev --create-namespace --kube-context http://localhost:5001
I get
Error: Kubernetes cluster unreachable: context "http://localhost:5001"
does not exist
How do I find out the issue and resolve it?
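For what it's worth, helm's --kube-context flag expects the name of a kubeconfig context rather than a URL, so a first step (a sketch, independent of the chart; the <...> placeholders are illustrative) is to list the contexts that actually exist and pass one of those, keeping the registry address only in the image reference:

# list the contexts defined in your kubeconfig; the active one is marked with *
kubectl config get-contexts

# then point helm at a real context name
helm upgrade --install --wait \
  --set env=dev --set image.tag=<tag> \
  userservice-api ./build/helm \
  --namespace dev --create-namespace \
  --kube-context <context-name-from-above>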
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "chart.fullname" . }}
  labels:
    {{- include "chart.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "chart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "chart.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 4006
              protocol: TCP
          env:
            - name: ENV
              value: "{{ .Values.env }}"
          readinessProbe:
            httpGet:
              path: /health
              port: 4006
            initialDelaySeconds: 15
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 4006
            initialDelaySeconds: 15
            periodSeconds: 10
values.yaml
replicaCount: 1
image:
  repository: localhost:5001/user-service
Additional Information
Can someone please help me with the issue?
In the Dockerfile,
RUN npm ci
had to be changed to
RUN npm ci --force
which resolved the issue for me.

Acumos AI Clio installation fails with "error converting YAML to JSON"

I have been trying to install the Clio release.
VM:
ubuntu 18.04
16 Cores
32 GB RAM
500 GB Storage.
Command:
bash /home/ubuntu/system-integration/tools/aio_k8s_deployer/aio_k8s_deployer.sh all acai-server ubuntu generic
Almost all steps of the installation completed successfully, but during "setup-lum" I got the error below.
Error:
YAML parse error on lum-helm/templates/deployment.yaml:
error converting YAML to JSON: yaml: line 36: mapping values are not allowed in this context
Workaround:
I was able to get past this error (tested via helm install --dry-run) by
a. removing the resources, affinity and tolerations blocks
b. replacing "Release.Name" with the actual release value (e.g. license-clio-configmap)
but when I run the full installation command, those helm charts are overwritten again.
Full error:
...
helm install -f kubernetes/values.yaml --name license-clio --namespace default --debug ./kubernetes/license-usage-manager/lum-helm
[debug] Created tunnel using local port: '46109'
[debug] SERVER: "127.0.0.1:46109"
[debug] Original chart version: ""
[debug] CHART PATH: /deploy/system-integration/AIO/lum/kubernetes/license-usage-manager/lum-helm
YAML parse error on lum-helm/templates/deployment.yaml: error converting YAML to JSON: yaml: line 36: mapping values are not allowed in this context
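To see what actually lands on line 36 of the rendered template, one low-tech option (a sketch that only uses standard Helm 2 flags) is to render the chart locally and read the output with line numbers:

# helm template renders the chart without installing it; the section starting with
# "# Source: lum-helm/templates/deployment.yaml" is the file the parser complains about,
# so look around line 36 of that section for the misplaced mapping
helm template -f kubernetes/values.yaml ./kubernetes/license-usage-manager/lum-helm | less -N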
YAML of deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "lum-helm.fullname" . }}
  labels:
    app: {{ template "lum-helm.name" . }}
    chart: {{ template "lum-helm.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ template "lum-helm.name" . }}
      release: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ template "lum-helm.name" . }}
        release: {{ .Release.Name }}
    spec:
      initContainers:
        - name: wait-for-db
          image: busybox:1.28
          command:
            - 'sh'
            - '-c'
            - >
              until nc -z -w 2 {{ .Release.Name }}-postgresql {{ .Values.postgresql.servicePort }} && echo postgresql ok;
                do sleep 2;
              done
      containers:
        - name: {{ .Chart.Name }}
          image: nexus3.acumos.org:10002/acumos/lum-server:default
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          env:
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ .Release.Name }}-postgresql
                  key: postgresql-password
            - name: NODE
          volumeMounts:
            - name: config-volume
              mountPath: /opt/app/lum/etc/config.json
              subPath: lum-config.json
          ports:
            - name: http
              containerPort: 2080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: '/api/healthcheck'
              port: http
            initialDelaySeconds: 60
            periodSeconds: 10
            failureThreshold: 10
          readinessProbe:
            httpGet:
              path: '/api/healthcheck'
              port: http
            initialDelaySeconds: 60
            periodSeconds: 10
            failureThreshold: 10
          resources:
{{ toYaml .Values.resources | indent 12 }}
    {{- with .Values.nodeSelector }}
      nodeSelector:
{{ toYaml . | indent 8 }}
    {{- end }}
    {{- with .Values.affinity }}
      affinity:
{{ toYaml . | indent 8 }}
    {{- end }}
    {{- with .Values.tolerations }}
      tolerations:
{{ toYaml . | indent 8 }}
    {{- end }}
      volumes:
        - name: config-volume
          configMap:
            name: {{ .Release.Name }}-configmap
This error was resolved as per "Error trying to install Acumos Clio using AIO":
I provided an image tag (1.3.2) in my actual value.yaml and the lum deployment was successful.
In the Acumos setup there are two copies of setup-lum.sh and values.yaml:
the actual one at
~/system-integration/AIO/lum/kubernetes/value.yaml
and the run-time copy at
~/aio_k8s_deployer/deploy/system-integration/AIO/lum/kubernetes/value.yaml
I found this workaround:
1. Uncommented the IMAGE-TAG line in the values.yaml file.
2. Commented out the following lines in the setup-lum.sh file (these were already executed on the first run, and this way I skipped the overwriting problem):
rm -frd kubernetes/license-usage-manager
git clone "https://gerrit.acumos.org/r/license-usage-manager" \
  kubernetes/license-usage-manager

Knative service with Keycloak gatekeeper sidecar

I am trying to deploy the following service:
{{- if .Values.knativeDeploy }}
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  {{- if .Values.service.name }}
  name: {{ .Values.service.name }}
  {{- else }}
  name: {{ template "fullname" . }}
  {{- end }}
  labels:
    chart: "{{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}"
spec:
  template:
    spec:
      containers:
        - image: quay.io/keycloak/keycloak-gatekeeper:9.0.3
          name: gatekeeper-sidecar
          ports:
            - containerPort: {{ .Values.keycloak.proxyPort }}
          env:
            - name: KEYCLOAK_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  name: {{ template "keycloakclient" . }}
                  key: secret
          args:
            - --resources=uri=/*
            - --discovery-url={{ .Values.keycloak.url }}/auth/realms/{{ .Values.keycloak.realm }}
            - --client-id={{ template "keycloakclient" . }}
            - --client-secret=$(KEYCLOAK_CLIENT_SECRET)
            - --listen=0.0.0.0:{{ .Values.keycloak.proxyPort }} # listen on all interfaces
            - --enable-logging=true
            - --enable-json-logging=true
            - --upstream-url=http://127.0.0.1:{{ .Values.service.internalPort }} # To connect with the main container's port
          resources:
{{ toYaml .Values.gatekeeper.resources | indent 12 }}
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          env:
            {{- range $pkey, $pval := .Values.env }}
            - name: {{ $pkey }}
              value: {{ quote $pval }}
            {{- end }}
          envFrom:
{{ toYaml .Values.envFrom | indent 10 }}
          ports:
            - containerPort: {{ .Values.service.internalPort }}
          livenessProbe:
            httpGet:
              path: {{ .Values.probePath }}
              port: {{ .Values.service.internalPort }}
            initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.livenessProbe.periodSeconds }}
            successThreshold: {{ .Values.livenessProbe.successThreshold }}
            timeoutSeconds: {{ .Values.livenessProbe.timeoutSeconds }}
          readinessProbe:
            httpGet:
              path: {{ .Values.probePath }}
              port: {{ .Values.service.internalPort }}
            periodSeconds: {{ .Values.readinessProbe.periodSeconds }}
            successThreshold: {{ .Values.readinessProbe.successThreshold }}
            timeoutSeconds: {{ .Values.readinessProbe.timeoutSeconds }}
          resources:
{{ toYaml .Values.resources | indent 12 }}
      terminationGracePeriodSeconds: {{ .Values.terminationGracePeriodSeconds }}
{{- end }}
Which fails with the following error:
Error from server (BadRequest): error when creating "/tmp/helm-template-workdir-290082188/jx/output/namespaces/jx-staging/env/charts/docs/templates/part0-ksvc.yaml": admission webhook "webhook.serving.knative.dev" denied the request: mutation failed: expected exactly one, got both: spec.template.spec.containers'
Now, if I read the specs (https://knative.dev/v0.15-docs/serving/getting-started-knative-app/), I can see this example:
apiVersion: serving.knative.dev/v1 # Current version of Knative
kind: Service
metadata:
  name: helloworld-go # The name of the app
  namespace: default # The namespace the app will use
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go # The URL to the image of the app
          env:
            - name: TARGET # The environment variable printed out by the sample app
              value: "Go Sample v1"
Which has exactly the same structure. Now, my questions are:
How can I validate my YAML without waiting for a deployment? IntelliJ has a Kubernetes plugin, but I can't find a machine-consumable CRD schema for serving.knative.dev/v1. (https://knative.dev/docs/serving/spec/knative-api-specification-1.0/)
Is it allowed with Knative to have multiple containers? (That configuration works perfectly with apiVersion: apps/v1, kind: Deployment.)
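On the validation question, one option worth noting (a sketch, assuming kubectl 1.18+ so server-side dry run is available) is to send the rendered manifest through the API server without persisting it; that exercises the same admission webhook that rejected the deploy above:

# validate the rendered manifest against the live API server and its admission webhooks;
# nothing is created (the file name here is the one from the error message above)
kubectl apply --dry-run=server -f part0-ksvc.yaml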
Multi-container support is an alpha feature in Knative version 0.16.
This feature needs to be enabled by setting multi-container to enabled in the config-features ConfigMap. So edit the ConfigMap using
kubectl edit cm config-features -n knative-serving and enable that feature.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
  labels:
    serving.knative.dev/release: devel
  annotations:
    knative.dev/example-checksum: "983ddf13"
data:
  _example: |
    ...
    # Indicates whether multi container support is enabled
    multi-container: "enabled"
    ...
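Equivalently, a non-interactive way to set the flag (assuming the default knative-serving namespace):

kubectl patch configmap/config-features \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"multi-container":"enabled"}}'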
What version of Knative are you using?
Support for multiple containers was added as an alpha feature in 0.16. If you're not using 0.16 or later or don't have the alpha flag enabled, the request will probably be blocked.
There were a number of edge cases to define for multi-container support in Knative, so the default was to be conservative and only allow one container until the constraints had been explored.

Is it allowed to have multiple services in a helm chart?

I am pretty new to Helm and would like to know if it is allowed to have multiple services in the service.yaml file, like this:
apiVersion: v1
kind: Service
metadata:
  name: {{ include "keycloak.fullname" . }}
  labels:
    {{- include "keycloak.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "keycloak.selectorLabels" . | nindent 4 }}
---
apiVersion: v1
kind: Service
metadata:
  name: {{ include "keycloak.fullname" . }}
  labels:
    {{- include "keycloak.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "keycloak.selectorLabels" . | nindent 4 }}
Yes it is. Are you facing any issue?
A cleaner way is to use two different files, service-a.yaml and service-b.yaml.
Note: better not to give both services the same name. In the example above both use {{ include "keycloak.fullname" . }}, so the second definition would simply overwrite the first.
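A minimal sketch of what two differently named services in one file could look like (the -http and -headless suffixes are made up for illustration; adjust the second spec to whatever the extra Service is actually for):

apiVersion: v1
kind: Service
metadata:
  name: {{ include "keycloak.fullname" . }}-http
  labels:
    {{- include "keycloak.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "keycloak.selectorLabels" . | nindent 4 }}
---
apiVersion: v1
kind: Service
metadata:
  name: {{ include "keycloak.fullname" . }}-headless
  labels:
    {{- include "keycloak.labels" . | nindent 4 }}
spec:
  clusterIP: None   # headless variant, purely as an example of a second, differently named Service
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "keycloak.selectorLabels" . | nindent 4 }}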