How to set Service Load Balancer request timeouts on GKE - kubernetes

I have a Service on GKE of type LoadBalancer that points to a GKE deployment running nginx. My nginx has all of its timeouts set to 10 minutes, yet HTTP/HTTPS requests that have to wait on processing before receiving a response get cut off with 500 errors after 30 seconds. My settings:
http {
    proxy_read_timeout 600s;
    proxy_connect_timeout 600s;
    keepalive_timeout 600s;
    send_timeout 600s;
}
Apparently there are default settings of 30 seconds in the LoadBalancer somewhere.
After poring over documentation, I've only found a Google walkthrough that outlines setting a timeout on the backend service of an Ingress-created load balancer, but I can't find how to do that on a Service of type LoadBalancer for use with GKE. I've also reviewed all of the Kubernetes documentation for versions 1.7+ (we're on 1.8.7-gke.1) and found nothing about setting a timeout. Is there a setting I can add to my YAML file to do this?
If it helps, I found the following for AWS, which appears to be what I would need on GKE:
annotations:
  service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"

As of April 2021 you can do this via GKE/GCE configuration. Here are the instructions.
Essentially you create a BackendConfig resource similar to this:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig
spec:
  timeoutSec: 40
  connectionDraining:
    drainingTimeoutSec: 60
(kubectl apply -f my-backendconfig.yaml)
and then connect it to your GKE service resource with an annotation:
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    purpose: bsc-config-demo
  annotations:
    cloud.google.com/backend-config: '{"ports": {"80":"my-backendconfig"}}'
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  selector:
    purpose: bsc-config-demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
(kubectl apply -f my-service.yaml)
If you prefer, the BackendConfig resource (and the Service) can be placed in a namespace by adding a namespace to the metadata in your YAML:
metadata:
  namespace: my-namespace
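If you want to confirm that the timeout actually reached the underlying GCE backend service, one way to check (a sketch, assuming you have the gcloud CLI and the load balancer has already been provisioned; BACKEND_SERVICE_NAME is a placeholder you look up from the list output) is:
# The BackendConfig is an ordinary namespaced resource, so you can inspect it with kubectl
kubectl get backendconfig my-backendconfig -o yaml
# List the backend services GKE created and check the timeout on the relevant one
gcloud compute backend-services list
gcloud compute backend-services describe BACKEND_SERVICE_NAME --global --format="value(timeoutSec)"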

So far you cannot do it from the YAML file.
There is an open feature request at the moment that I advise you to subscribe to and follow:
https://github.com/kubernetes/ingress-gce/issues/28
This change was already being discussed back in 2016: issue.
"Specific use case: GCE backends are provisioned with a default timeout of 30 seconds, which is not sufficient for some long requests. I'd like to be able to control the timeout per-backend."
However, I would suggest you check this part of the Google Cloud documentation, which deals specifically with the configurable response timeout.
UPDATE
Check the issue, because they are making progress:
I see there was a v1.0.0 release 18 days ago. Was this the completion of the major refactoring you talked about #nicksardo ?
Is it possible yet to configure how long a connection can be idle before the LB closes it?
UPDATE
The issue mentioned above is now closed and documentation for setting the timeout (and other backend service settings) is available here:
https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service

Related

HAProxy Ingress Controller Service Changed IP on GCP

I am using HAProxy as the ingress controller in my GKE clusters, exposing the HAProxy service as an internal LoadBalancer service.
Recently I experienced an issue where the HAProxy service changed its EXTERNAL-IP and traffic stopped routing to HAProxy. This occurred multiple times on different days (it has now stopped). I had to manually add the new external IP to the frontend of that load balancer to allow traffic to reach HAProxy.
Two pods were running for HAProxy, both had been up for days, and there was nothing in their logs. I assume it was something related to the Service or the GCP LB and not HAProxy itself.
I am afraid I don't have any logs related to that.
I still don't know what caused the service IP to change: there were no recent changes, the cluster and all services had been running properly for many days, and then this suddenly occurred.
Has anyone faced a similar issue before? What can I do to avoid such an issue in the future?
What could have caused the IP to change?
This is how my service is configured:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: haproxy-ingress
  name: haproxy-ingress
  namespace: haproxy-controller
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    networking.gke.io/internal-load-balancer-allow-global-access: "true"
    cloud.google.com/network-tier: "Premium"
spec:
  selector:
    run: haproxy-ingress
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  - name: stat
    port: 1024
    protocol: TCP
    targetPort: 1024
Found some logs:
Warning SyncLoadBalancerFailed 30m (x3570 over 13d) service-controller Error syncing load balancer: failed to ensure load balancer: googleapi: Error 409: IP_IN_USE_BY_ANOTHER_RESOURCE - IP '10.17.129.17' is already being used by another resource.
Normal EnsuringLoadBalancer 3m33s (x3576 over 13d) service-controller Ensuring load balancer
The short answer is: the external IP for the service is ephemeral.
Because the HAProxy controller pods are recreated, the HAProxy service is created with an ephemeral IP.
To avoid this issue, I would recommend using a static IP that you can reference in the loadBalancerIP field.
This can be done with the following steps:
1. Reserve a static IP (link); a gcloud sketch is included after the example YAML below.
2. Use this IP to create the service (link).
Example YAML:
apiVersion: v1
kind: Service
metadata:
  name: helloweb
  labels:
    app: hello
spec:
  selector:
    app: hello
    tier: web
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer
  loadBalancerIP: "YOUR.IP.ADDRESS.HERE"
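For completeness, reserving the address itself can be done with gcloud. A sketch for an internal address like the one used in the question (the address name, region, and subnet are placeholders, not taken from the original setup):
# Reserve a static internal IP in the subnet used by the internal load balancer
gcloud compute addresses create haproxy-ingress-ip --region=us-central1 --subnet=my-subnet
# Print the reserved address so it can be placed in the Service's loadBalancerIP field
gcloud compute addresses describe haproxy-ingress-ip --region=us-central1 --format="value(address)"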
Unfortunately, without logs it's hard to say anything for sure. You should check the audit logs that GKE ships to Cloud Logging, as they might give you some idea of what happened. One possibility is that GCP "oops"'d the GLB and GKE recreated it, thus giving it a new IP; I've never heard of that happening with LBs, though (it happens pretty often with nodes, but not LBs). A more common case would be that you ran some kubectl command that inadvertently removed the Service object, and it was then recreated by some management layer you have set up (Argo, Flux, Helm Operator, whatever); delete+recreate again means it's a new LB with a new IP. The latter case should be visible in the audit logs, so check those out for sure.
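If you do go through the audit logs, something along these lines may help surface delete/recreate events on the Service object (a sketch only; the exact filter fields for your cluster may differ):
# Look for Kubernetes audit entries that touched the haproxy-ingress Service
gcloud logging read 'resource.type="k8s_cluster" AND protoPayload.resourceName:"services/haproxy-ingress"' --limit=20 --format="value(timestamp, protoPayload.methodName)"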

Default Load Balancing in Kubernetes

I've recently started working with Kubernetes clusters. The flow of network calls for a given Kubernetes service in our cluster is something like the following:
External Non-K8S Load Balancer -> Ingress Controller -> Ingress Resource -> Service -> Pod
For a given service, there are two replicas. By looking at the logs of the containers in the replicas, I can see that calls are being routed to different pods. As far as I can see, we haven't explicitly set up any load-balancing policies anywhere for our services in Kubernetes.
I've got a few questions:
1) Is there a default load-balancing policy for K8S? I've read about kube-proxy and random routing. It definitely doesn't appear to be round-robin.
2) Is there an obvious way to specify load balancing rules in the Ingress resources themselves? On a per-service basis?
Looking at one of our Ingress resources, I can see that the 'loadBalancer' property is empty:
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/rewrite-target: /
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"ingress.kubernetes.io/rewrite-target":"/","nginx.ingress.kubernetes.io/rewrite-target":"/"},"name":"example-service-ingress","namespace":"member"},"spec":{"rules":[{"host":"example-service.x.x.x.example.com","http":{"paths":[{"backend":{"serviceName":"example-service-service","servicePort":8080},"path":""}]}}]}}
    nginx.ingress.kubernetes.io/rewrite-target: /
  creationTimestamp: "2019-02-13T17:49:29Z"
  generation: 1
  name: example-service-ingress
  namespace: x
  resourceVersion: "59178"
  selfLink: /apis/extensions/v1beta1/namespaces/x/ingresses/example-service-ingress
  uid: b61decda-2fb7-11e9-935b-02e6ca1a54ae
spec:
  rules:
  - host: example-service.x.x.x.example.com
    http:
      paths:
      - backend:
          serviceName: example-service-service
          servicePort: 8080
status:
  loadBalancer:
    ingress:
    - {}
I should specify - we're using an on-prem Kubernetes cluster, rather than on the cloud.
Cheers!
The "internal load balancing" between Pods of a Service has already been covered in this question from a few days ago.
Ingress isn't really doing anything special (unless you've been hacking in the NGINX config it uses) - it will use the same Service rules as in the linked question.
If you want or need fine-grained control over how traffic is routed to pods within a service, it is possible to extend Kubernetes' features. I recommend looking into the traffic-management features of Istio, one of which is the ability to dynamically control how much traffic different pods in a service receive.
I see two options that can be used with k8s:
Use Istio's traffic management and create a DestinationRule. It currently supports three load-balancing modes:
Round robin
Random
Weighted least request
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
...
spec:
  ...
  subsets:
  - name: test
    ...
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
Use lb_type in the Envoy proxy with Ambassador on k8s. More info about Ambassador is at https://www.getambassador.io.

GKE Load Balancer - Ingress - Service - Session Affinity (Sticky Session)

I had sticky sessions working in my dev environment with minikube, with the following configuration:
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: gl-ingress
  annotations:
    nginx.ingress.kubernetes.io/affinity: cookie
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "projects/oceanic-isotope-199421/global/addresses/web-static-ip"
spec:
  backend:
    serviceName: gl-ui-service
    servicePort: 80
  rules:
  - http:
      paths:
      - path: /api/*
        backend:
          serviceName: gl-api-service
          servicePort: 8080
Service:
apiVersion: v1
kind: Service
metadata:
  name: gl-api-service
  labels:
    app: gl-api
  annotations:
    ingress.kubernetes.io/affinity: 'cookie'
spec:
  type: NodePort
  ports:
  - port: 8080
    protocol: TCP
  selector:
    app: gl-api
Now that I have deployed my project to GKE, sticky sessions no longer function. I believe the reason is that the global load balancer configured in GKE does not have session affinity with the NGINX Ingress controller. Has anyone had any luck wiring this up? Any help would be appreciated. I want to establish session affinity: Client Browser > Load Balancer > Ingress > Service. The actual session lives in the pods behind the service. It's an API gateway (built with Zuul).
Session affinity is not available yet in the GCE/GKE Ingress controller.
In the meantime, and as a workaround, you can use the GCE API directly to create the HTTP load balancer. Note that you can't use Ingress at the same time in the same cluster.
1. Use NodePort for the Kubernetes Service. Set the value of the port in spec.ports[*].nodePort, otherwise a random one will be assigned (see the sketch after this list).
2. Disable kube-proxy SNAT load balancing.
3. Create a load balancer from the GCE API, with cookie session affinity enabled. As the backend, use the port from step 1.
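For step 1, a minimal sketch of a Service with the NodePort pinned explicitly (the port values here are placeholders, not taken from the question):
apiVersion: v1
kind: Service
metadata:
  name: gl-api-service
spec:
  type: NodePort
  selector:
    app: gl-api
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 30080   # pinned so the GCE backend can reference a known port
    protocol: TCP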
Good news! They finally support these kinds of tweaks as beta features!
Beginning with GKE version 1.11.3-gke.18, you can use an Ingress to configure these properties of a backend service:
Timeout
Connection draining timeout
Session affinity
The configuration information for a backend service is held in a custom resource named BackendConfig, which you can "attach" to a Kubernetes Service.
Together with other sweet beta features (like CDN, Cloud Armor, etc.), you can find the how-to guides here:
https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service
Based on this: https://github.com/kubernetes/ingress-gce/blob/master/docs/annotations.md
there is no annotation available that could affect the session affinity setting of the Google Cloud Load Balancer (GCLB) that is created as a result of the Ingress creation. As such:
This has to be turned on by hand: either, as suggested above, by creating the LB yourself, or by letting the ingress controller do so and then changing the backend configuration for each backend (either via the GUI or the gcloud CLI). IMHO the latter seems faster and less error-prone. (Tested: the "GCLB" cookie was returned by the LB after the config change was propagated automatically, and subsequent requests including the cookie were routed to the same node.)
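A sketch of the gcloud variant of that backend change (the backend-service name is a placeholder; the flags are given to the best of my knowledge, so double-check against gcloud compute backend-services update --help):
gcloud compute backend-services update k8s-be-30080--example \
    --global \
    --session-affinity=GENERATED_COOKIE \
    --affinity-cookie-ttl=1800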
As rightfully pointed out by Matt-y-er: service.spec.externalTrafficPolicy has to be set to "Local" to disable forwarding from the node the GCLB selected to another node. However:
One would still need to ensure that:
the GCLB does not send traffic to nodes which don't run the pod, or
there is a pod running on all nodes (and only a single pod, as the externalTrafficPolicy setting would not prevent load balancing over multiple local pods).
With regard to #3, the simple solution:
convert the deployment to a DaemonSet, so there will be exactly one pod on each node (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) at all times.
The more complicated solution (which allows having fewer pods than nodes):
It seems that the GCLB's health check doesn't need to be adjusted, as the Ingress rule definition automatically sets up a health check against the backend (and not against the default healthz service).
Supply anti-affinity rules to make sure there is at most a single instance of the pod on each node (https://kubernetes.io/docs/concepts/configuration/assign-pod-node/).
Note: the above anti-affinity version was tested on 24 July 2018 with Kubernetes version 1.10.4-gke.2 on a 2-node cluster running COS (the default GKE VM image).
I was trying the GKE tutorial for this on version 1.11.6-gke.6 (the latest available).
Stickiness was not there... the only thing that worked was setting externalTrafficPolicy: "Local" on the service...
spec:
  type: NodePort
  externalTrafficPolicy: Local
I opened a defect with Google about this, and they accepted it, without committing to an ETA.
https://issuetracker.google.com/issues/124064870
For the BackendConfig of the Ingress load balancer, documentation can be found here:
https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-features
An example snippet for the generated-cookie affinity type is:
spec:
  timeoutSec: 1800
  connectionDraining:
    drainingTimeoutSec: 1800
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"
    affinityCookieTtlSec: 1800
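To attach such a BackendConfig, the Service is annotated the same way as in the timeout examples above; a minimal sketch (the resource names and port are placeholders):
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: my-affinity-backendconfig
spec:
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"
    affinityCookieTtlSec: 1800
---
apiVersion: v1
kind: Service
metadata:
  name: gl-api-service
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"8080":"my-affinity-backendconfig"}}'
spec:
  type: NodePort
  selector:
    app: gl-api
  ports:
  - port: 8080
    protocol: TCP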

Auto-creating A records with kubernetes services

I've got a kubernetes 1.6.2 cluster, and am creating a service like:
kind: Service
apiVersion: v1
metadata:
  name: hello
  namespace: myns
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
    dns.alpha.kubernetes.io/internal: mydomain.com
spec:
  selector:
    app: hello-world
  ports:
  - protocol: "TCP"
    port: 80
    targetPort: 5000
  type: LoadBalancer
I'd expect this to create an internal ELB (which it does) but also set up an A record on the AWS Route53 hosted zone for mydomain.com as per https://github.com/kubernetes/kops/tree/master/dns-controller (which it doesn't). Is there something I need to do to enable A record creation?
In Route 53, create a type A record with Alias set to Yes prior to launching your cluster, and Kubernetes will auto-update it with the proper alias target, which gets resolved to the correct IP upon app boot-up when you issue
kubectl expose rs .....
This might not work, since dns.alpha.kubernetes.io/internal requires that you use NodePort; at least, that's what's written here: https://github.com/kubernetes/kops/issues/1082 (a NodePort sketch follows after the update below).
Update:
There's an open issue about this A record. A CNAME record can be created, but an A record is not yet working. I forgot the issue number, but I think you can find it in the kops GitHub issues.
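If you want to try the NodePort route that the kops issue points at, a minimal sketch based on the manifest from the question (whether the annotation then actually creates the A record is not something I can confirm):
kind: Service
apiVersion: v1
metadata:
  name: hello
  namespace: myns
  annotations:
    dns.alpha.kubernetes.io/internal: mydomain.com
spec:
  type: NodePort   # per the linked kops issue, the internal DNS annotation expects NodePort
  selector:
    app: hello-world
  ports:
  - protocol: "TCP"
    port: 80
    targetPort: 5000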

How to configure Ingress request timeouts on GKE

I currently have an Ingress configured on GKE (k8s 1.2) to forward requests to my application's pods. I have a request which can take a long time (30 seconds) and time out from my application (504). I observe that when this happens, the response I receive is not my own 504 but a 502 from what looks like the Google load balancer, after 60 seconds.
I have played around with different status codes and durations; exactly after 30 seconds I start seeing this weird behaviour regardless of the status code emitted.
Does anybody have a clue how I can fix this? Is there a way to reconfigure this behaviour?
Beginning with 1.11.3-gke.18, it is possible to configure timeout settings in Kubernetes directly.
First, add a BackendConfig:
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: my-bsc-backendconfig
spec:
  timeoutSec: 40
Then add an annotation to the Service to use this BackendConfig:
apiVersion: v1
kind: Service
metadata:
  name: my-bsc-service
  labels:
    purpose: bsc-config-demo
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"80":"my-bsc-backendconfig"}}'
spec:
  type: NodePort
  selector:
    purpose: bsc-config-demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
And voilà, your Ingress load balancer now has a timeout of 40 seconds instead of the default 30 seconds.
See https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service#creating_a_backendconfig
When creating an Ingress on GKE, the default setup is that a GLBC HTTP load balancer will be created with the backends that you supplied. By default it is configured with a 30-second timeout for your application to handle the request.
If you need a longer timeout, you have to edit this manually after setup in the backends of your HTTP load balancer in the Google Cloud console.
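The same manual change can also be made from the command line; a sketch, assuming you first look up the auto-generated backend name (the name below is a placeholder, and the flag is given to the best of my knowledge):
# Ingress-created backends typically have generated names, e.g. k8s-be-<nodePort>--<suffix>
gcloud compute backend-services list
# Raise the request timeout on that backend service
gcloud compute backend-services update k8s-be-30080--example --global --timeout=600s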