Pod load distribution in Kubernetes

I have a service in Kubernetes that receives HTTP requests to create users. With 1 pod it correctly handles about 100 requests per minute; beyond that, latency appears. My assumption is: if 1 pod can hold 100 requests per minute, shouldn't 5 pods hold 500 requests per minute?
Because even with 10 pods, as soon as we exceed 100 requests per minute the load is not distributed correctly and latency appears in the service.
As I understand it, the default load-balancing configuration is round robin, but the problem is that RAM grows in only one of the pods, so the load is not being spread evenly.
These are my deployment, service, and HPA YAMLs.
Deployment YAML:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: create-user-service
  labels:
    app: create-user-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: create-user-service
  template:
    metadata:
      labels:
        app: create-user-service
    spec:
      volumes:
        - name: key
          secret:
            secretName: my-secret-key
      containers:
        ### [LISTPARTY CONTAINER]
        - name: create-user-service
          image: docker/user-create/create-user-service:0.0.1
          volumeMounts:
            - name: key
              mountPath: /var/secrets/key
          ports:
            - containerPort: 8080
          env:
            - name: PORT
              value: "8080"
          resources:
            limits:
              cpu: "2.5"
              memory: 6Gi
            requests:
              cpu: "1.5"
              memory: 5Gi
          livenessProbe: ## is healthy
            failureThreshold: 3
            httpGet:
              path: /healthcheck/livenessprobe
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: create-user-service
spec:
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: http
  selector:
    app: create-user-service
  type: NodePort
HPA YAML:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: create-user-service
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: create-user-service
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 75
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: 75
    - type: External
      external:
        metricName: serviceruntime.googleapis.com|api|request_count
        metricSelector:
          matchLabels:
            resource.type: api
            resource.labels.service: create-user-service.endpoints.user-create.cloud.goog
        targetAverageValue: "3"
What may be happening?
Thank you all.
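For reference, per-pod resource usage and the Service's endpoints can be checked like this (a diagnostic sketch; it assumes metrics-server is installed in the cluster):
# Show CPU/memory per pod, to confirm whether one pod is taking most of the load
kubectl top pods -l app=create-user-service

# Confirm that every pod IP is registered behind the Service
kubectl get endpoints create-user-service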

Related

GKE "no.scale.down.node.pod.not.enough.pdb" log even with existing PDB

My GKE cluster is displaying a "Scale down blocked by pod" note. Clicking it and going to the Logs Explorer shows a filtered view with log entries for the pods that had the incident: no.scale.down.node.pod.not.enough.pdb. But that's really strange, since the pods in those log entries do have a PDB defined for them. So it seems to me that GKE is wrongly reporting the cause of the blocked node scale-down. These are the manifests for one of the pods with this issue:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ms-new-api-beta
  name: ms-new-api-beta
  namespace: beta
spec:
  ports:
    - port: 8000
      protocol: TCP
      targetPort: 8000
  selector:
    app: ms-new-api-beta
  type: NodePort
The Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ms-new-api-beta
  name: ms-new-api-beta
  namespace: beta
spec:
  selector:
    matchLabels:
      app: ms-new-api-beta
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: 'true'
      labels:
        app: ms-new-api-beta
    spec:
      containers:
        - command:
            - /deploy/venv/bin/gunicorn
            - '--bind'
            - '0.0.0.0:8000'
            - 'newapi.app:app'
            - '--chdir'
            - /deploy/app
            - '--timeout'
            - '7200'
            - '--workers'
            - '1'
            - '--worker-class'
            - uvicorn.workers.UvicornWorker
            - '--log-level'
            - DEBUG
          env:
            - name: ENV
              value: BETA
          image: >-
            gcr.io/.../api:${trigger['tag']}
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /rest
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 120
            periodSeconds: 20
            timeoutSeconds: 30
          name: ms-new-api-beta
          ports:
            - containerPort: 8000
              name: http
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /rest
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 2
          resources:
            limits:
              cpu: 150m
            requests:
              cpu: 100m
          startupProbe:
            failureThreshold: 30
            httpGet:
              path: /rest
              port: 8000
            periodSeconds: 120
      imagePullSecrets:
        - name: gcp-docker-registry
The Horizontal Pod Autoscaler:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: ms-new-api-beta
  namespace: beta
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ms-new-api-beta
  targetCPUUtilizationPercentage: 100
And finally, the Pod Disruption Budget:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ms-new-api-beta
  namespace: beta
spec:
  minAvailable: 0
  selector:
    matchLabels:
      app: ms-new-api-beta
no.scale.down.node.pod.not.enough.pdb is not complaining about the lack of a PDB. It is complaining that, if the pod is scaled down, it will be in violation of the existing PDB(s).
The "budget" is how much disruption the Pod can permit. The platform will not take any intentional action which violates that budget.
There may be another PDB in place that would be violated. To check, review the PDBs in the pod's namespace:
kubectl get pdb
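For illustration, here is a hypothetical PDB (not one from the question) that would produce exactly this situation: with only two replicas running, it allows zero voluntary disruptions, so the autoscaler cannot drain the node.
# Illustrative manifest with an assumed name; with 2 matching replicas
# this budget allows zero voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: some-other-pdb
  namespace: beta
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: ms-new-api-beta
The output of kubectl get pdb -n beta includes an ALLOWED DISRUPTIONS column, which makes it easy to spot the budget that is sitting at zero.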

How is it that my k8s service is accessible without explicitly exposing a port?

I am running an HTTP server in k8s. I am able to access the server via its svc name from another service by making a call to https://e2e-test-runner.hogwarts.svc.cluster.local. However, I haven't explicitly configured a service port. How does this work?
Does k8s map the container port as the service port when no service port is present?
This is my service definition:
apiVersion: v1
kind: Service
metadata:
  name: e2e-test-runner
  namespace: hogwarts
spec:
  selector:
    app: e2e-test-runner
This is my deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: e2e-test-runner
  namespace: hogwarts
spec:
  strategy:
    rollingUpdate:
      maxSurge: 3
      maxUnavailable: 0
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: e2e-test-runner
  template:
    metadata:
      labels:
        app: e2e-test-runner
        app.kubernetes.io/name: e2e-test-runner
    spec:
      containers:
        - name: app
          image: $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/e2e-test-runner:$IMAGE_VERSION
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "250Mi"
              cpu: "100m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health_check
              port: 3000
              scheme: "HTTPS"
            initialDelaySeconds: 20
            periodSeconds: 10
            timeoutSeconds: 2
          readinessProbe:
            httpGet:
              path: /health_check
              port: 3000
              scheme: "HTTPS"
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
      # Gives pods 15 seconds to complete any outstanding requests before being force killed
      terminationGracePeriodSeconds: 15
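One way to see how the API server treated this port-less Service (a diagnostic sketch, not from the original post) is to dump the stored object and its endpoints:
# Show the Service exactly as stored, including any defaulted fields
kubectl get svc e2e-test-runner -n hogwarts -o yaml

# Show which pod IPs and ports are actually registered behind it
kubectl get endpoints e2e-test-runner -n hogwarts -o yaml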

Cannot add linkerd inject annotation in deployment file

I have a Kubernetes deployment file, user.yaml -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-deployment
  namespace: stage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: user
  template:
    metadata:
      labels:
        app: user
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9022'
    spec:
      nodeSelector:
        env: stage
      containers:
        - name: user
          image: <docker image path>
          imagePullPolicy: Always
          resources:
            limits:
              memory: "512Mi"
              cpu: "250m"
            requests:
              memory: "256Mi"
              cpu: "200m"
          ports:
            - containerPort: 8080
          env:
            - name: MODE
              value: "local"
            - name: PORT
              value: ":8080"
            - name: REDIS_HOST
              value: "xxx"
            - name: KAFKA_ENABLED
              value: "true"
            - name: BROKERS
              value: "xxx"
      imagePullSecrets:
        - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  namespace: stage
  name: user
spec:
  selector:
    app: user
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: user
  namespace: stage
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 300Mi
This deployment is already running with linkerd injected, via the command cat user.yaml | linkerd inject - | kubectl apply -f -.
Now I want to add the linkerd inject annotation (as mentioned here) and use kubectl apply -f user.yaml, just like I do for a deployment without linkerd injected.
However, with the modified user.yaml (after adding the linkerd.io/inject annotation to the deployment) -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-deployment
  namespace: stage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: user
  template:
    metadata:
      labels:
        app: user
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9022'
        linkerd.io/inject: enabled
    spec:
      nodeSelector:
        env: stage
      containers:
        - name: user
          image: <docker image path>
          imagePullPolicy: Always
          resources:
            limits:
              memory: "512Mi"
              cpu: "250m"
            requests:
              memory: "256Mi"
              cpu: "200m"
          ports:
            - containerPort: 8080
          env:
            - name: MODE
              value: "local"
            - name: PORT
              value: ":8080"
            - name: REDIS_HOST
              value: "xxx"
            - name: KAFKA_ENABLED
              value: "true"
            - name: BROKERS
              value: "xxx"
      imagePullSecrets:
        - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  namespace: stage
  name: user
spec:
  selector:
    app: user
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: user
  namespace: stage
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 300Mi
When I run kubectl apply -f user.yaml, it throws this error -
service/user unchanged
horizontalpodautoscaler.autoscaling/user configured
Error from server (BadRequest): error when creating "user.yaml": Deployment in version "v1" cannot be handled as a Deployment: v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.v1.EnvVar.Value: ReadString: expects " or n, but found 1, error found in #10 byte of ...|,"value":1},{"name":|..., bigger context ...|ue":":8080"}
Can anyone please point out where I have gone wrong in adding the annotation?
Thanks
Try with double quotes, like below:
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9022"
  linkerd.io/inject: enabled
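A general YAML note on why the quoting matters (background, not specific to linkerd): Kubernetes expects annotation values and env var values to be strings, but unquoted scalars such as true or 9022 are parsed by YAML as a boolean and an integer, which is the kind of type mismatch behind errors like "cannot be handled as a Deployment ... expects \" or n". A minimal, purely illustrative snippet:
env:
  - name: PORT
    value: 8080     # unquoted: parsed as an integer, rejected by the API server
  - name: PORT
    value: "8080"   # quoted: parsed as a string, accepted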

need to apply the prometheus rules from alertmanager repo in k8s

I have 2 GitLab repos:
=> gitlab a
=> gitlab b
gitlab a - contains the StatefulSet and pod of Prometheus and the Prometheus Pushgateway.
gitlab b - contains the Alertmanager service, the Alertmanager pod, and the Prometheus rules.
All the pods and containers are up and running.
I am trying to apply the Prometheus rules to the Prometheus StatefulSet.
(screenshot: prometheusRule.png)
I need to apply the kind: PrometheusRule resource to the StatefulSet of Prometheus.
Can someone help?
The applied rules YAML:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
  namespace: cmp-monitoring
spec:
  groups:
    - name: node-exporter.rules
      rules:
        - expr: |
            count without (cpu) (
              count without (mode) (
                node_cpu_seconds_total{job="node-exporter"}
              )
            )
          record: instance:node_num_cpu:sum
        - expr: |
            1 - avg without (cpu, mode) (
              rate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[1m])
            )
          record: instance:node_cpu_utilisation:rate1m
The Prometheus StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  serviceName: prometheus
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: prometheus
          image: prom/prometheus
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 9090
          volumeMounts:
            - name: prometheus-config
              mountPath: "/etc/prometheus/prometheus.yml"
              subPath: prometheus.yml
            - name: prometheus-data
              mountPath: "/prometheus"
            #- name: rules-general
            #  mountPath: "/etc/prometheus/prometheus.rules.yml"
            #  subPath: prometheus.rules.yml
          livenessProbe:
            httpGet:
              path: /-/healthy
              port: 9090
            initialDelaySeconds: 120
            periodSeconds: 40
            successThreshold: 1
            timeoutSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /-/healthy
              port: 9090
            initialDelaySeconds: 120
            periodSeconds: 40
            successThreshold: 1
            timeoutSeconds: 10
            failureThreshold: 3
      securityContext:
        fsGroup: 1000
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-server-conf
        #- name: rules-general
        #  configMap:
        #    name: prometheus-server-conf
  volumeClaimTemplates:
    - metadata:
        name: prometheus-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: rbd-default
        resources:
          requests:
            storage: 10Gi
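PrometheusRule is a Prometheus Operator CRD, and a vanilla prom/prometheus container like the one in this StatefulSet does not watch those objects. One possible approach, sketched below with assumed names (a prometheus-rules ConfigMap holding a prometheus.rules.yml file), is to ship the rules as a plain ConfigMap, mount it next to the existing config (the commented-out rules-general volume above hints at this), and reference the file from prometheus.yml:
# Sketch with assumed names, not a confirmed setup
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: cmp-monitoring
data:
  prometheus.rules.yml: |
    groups:
      - name: node-exporter.rules
        rules:
          - record: instance:node_num_cpu:sum
            expr: count without (cpu) (count without (mode) (node_cpu_seconds_total{job="node-exporter"}))
Then, in prometheus.yml:
rule_files:
  - /etc/prometheus/prometheus.rules.yml
After mounting the ConfigMap at /etc/prometheus/prometheus.rules.yml (as in the commented-out rules-general volumeMount), a pod restart (or a reload via the /-/reload endpoint, if --web.enable-lifecycle is set) picks up the rules.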

Accessing kubernetes headless service over ambassador

I have deployed my service as a headless service and followed the Kubernetes configuration mentioned in this link (http://vertx.io/docs/vertx-hazelcast/java/#_using_this_cluster_manager). My service is load balanced and proxied using Ambassador. Everything was working fine as long as the service was not headless. Once the service became headless, Ambassador was no longer able to discover my services: it was looking for a clusterIP, and that is now missing because the services are headless. What do I need to include in my deployment.yaml so these services are discovered by Ambassador?
The error I see is "upstream connect error or disconnect/reset before headers. reset reason: connection failure".
I need these services to be headless because that is the only way to create a cluster using Hazelcast. And I am creating a WebSocket connection and a Vert.x event bus.
apiVersion: v1
kind: Service
metadata:
  name: abt-login-service
  labels:
    chart: "abt-login-service-0.1.0-SNAPSHOT"
  annotations:
    fabric8.io/expose: "true"
    fabric8.io/ingress.annotations: 'kubernetes.io/ingress.class: nginx'
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v1
      name: login_mapping
      ambassador_id: default
      kind: Mapping
      prefix: /login/
      service: abt-login-service.default.svc.cluster.local
      use_websocket: true
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: RELEASE-NAME-abt-login-service
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
      name: http
    - name: hz-port-name
      port: 5701
      protocol: TCP
Deployment.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: RELEASE-NAME-abt-login-service
  labels:
    draft: draft-app
    chart: "abt-login-service-0.1.0-SNAPSHOT"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: RELEASE-NAME-abt-login-service
  minReadySeconds: 30
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        draft: draft-app
        app: RELEASE-NAME-abt-login-service
        component: abt-login-service
    spec:
      serviceAccountName: vault-auth
      containers:
        - name: abt-login-service
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "dev"
            - name: _JAVA_OPTIONS
              value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2 -Dsun.zip.disableMemoryMapping=true -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Dhazelcast.diagnostics.enabled=true"
          image: "draft:dev"
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
            - containerPort: 5701
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: 500m
              memory: 1024Mi
            requests:
              cpu: 400m
              memory: 512Mi
      terminationGracePeriodSeconds: 10
How can I make these services discoverable by Ambassador?
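As a first diagnostic (a sketch, not a confirmed fix): a headless Service still publishes Endpoints and per-pod DNS A records, so it is worth verifying that the name referenced in the Mapping (abt-login-service.default.svc.cluster.local) actually resolves and has endpoints behind it:
# The headless Service should still list the pod IPs as endpoints
kubectl get endpoints abt-login-service

# From inside the cluster, the headless name resolves to the individual
# pod IPs rather than a single cluster IP
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
  nslookup abt-login-service.default.svc.cluster.local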