Conditional reverse proxy in Kubernetes based on server response

TLDR: I am looking for a solution that would allow me to proxy traffic between two different Kubernetes services, based on their response.
Background:
I have an existing application hosted on Kubernetes. Recently I started rewriting one of my microservices in order to speed it up and add a few new features. I want to allow my users to decide whether they want to start using this new service or stick to the old one (since some features have breaking changes for their use case).
Since users usually reach this microservice using an address like username.given-microservice.example.com, my initial plan was to set up a proxy between these services, which could ask one of my endpoints using a query like:
http://my-new-service.example.com/enabled-for-client?=username
if it returned code 200, then the client would be forwarded to the new service
if the response code was anything else, then the client would be forwarded to the old service.
Of course, the response from the URI above would depend on user settings.
This scenario is very similar to A/B testing, but I do not know of, and have trouble finding, any way to set up a proxy based on a URL response.
I would highly appreciate any suggestions, blog posts, or links to documentation that could help me solve my scenario - at the moment I have run out of ideas and I feel stuck.

Envoy can handle such a scenario; start by looking at HTTP routing. If you cannot find what you're looking for, you can always write filter/routing rules in Lua.
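For illustration only, here is a minimal sketch of such a Lua filter inside an Envoy http_connection_manager filter chain; the settings_service cluster name, the x-username header, and routing on an x-backend header are assumptions built from the question, not a tested configuration:
http_filters:
- name: envoy.filters.http.lua
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
    inline_code: |
      function envoy_on_request(request_handle)
        -- ask the settings endpoint whether this user is enabled for the new service
        local user = request_handle:headers():get("x-username") or ""
        local headers, _ = request_handle:httpCall(
          "settings_service",
          {
            [":method"] = "GET",
            [":path"] = "/enabled-for-client?=" .. user,
            [":authority"] = "my-new-service.example.com"
          },
          "", 1000)
        -- tag the request; an ordinary route can then match on x-backend
        if headers[":status"] == "200" then
          request_handle:headers():add("x-backend", "new")
        else
          request_handle:headers():add("x-backend", "old")
        end
      end
A regular route entry matching on the x-backend header would then send the request to either the new or the old cluster.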

It's possible to achieve this by using NGINX Ingress with custom-http-errors and default-backend annotations.
I created a lab to prove the concept. Let's dive into it together.
First of all you have to install NGINX Ingress in your cluster. If you don't have it, follow the installation guide.
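If it helps, on most cloud providers the install boils down to a single kubectl apply of the provider manifest, similar to the command used later in this thread (the version in the URL is only an example; check the guide for the current one):
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.35.0/deploy/static/provider/cloud/deploy.yaml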
In this POC we will deploy 2 different applications. One is called old-http-backend and it serves a default nginx landing page. The second is called new-http-backend and it serves an echo-server landing page.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: old-http-backend
spec:
  selector:
    matchLabels:
      app: old-http-backend
  template:
    metadata:
      labels:
        app: old-http-backend
    spec:
      containers:
      - name: old-http-backend
        image: nginx
        ports:
        - name: http
          containerPort: 80
        imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  name: old-http-backend
spec:
  selector:
    app: old-http-backend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: new-http-backend
spec:
  selector:
    matchLabels:
      app: new-http-backend
  template:
    metadata:
      labels:
        app: new-http-backend
    spec:
      containers:
      - name: new-http-backend
        image: inanimate/echo-server
        ports:
        - name: http
          containerPort: 8080
        imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  name: new-http-backend
spec:
  selector:
    app: new-http-backend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
After applying this manifest we have the following deployments and services:
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
new-http-backend 1/1 1 1 2s
old-http-backend 1/1 1 1 43m
$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.31.240.1 <none> 443/TCP 152m
new-http-backend ClusterIP 10.31.240.168 <none> 80/TCP 44m
old-http-backend ClusterIP 10.31.242.175 <none> 80/TCP 44m
And now we can apply our Ingress that will be responsible for doing all the magic for us:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/rewrite-target: "/"
    nginx.ingress.kubernetes.io/custom-http-errors: '403,404,500,502,503,504'
    nginx.ingress.kubernetes.io/default-backend: old-http-backend
spec:
  rules:
  - host: app.company.com
    http:
      paths:
      - path: "/"
        backend:
          serviceName: new-http-backend
          servicePort: 80
What is this Ingress doing?
In the Custom HTTP Errors documentation we can read that if a default-backend annotation is specified on the ingress, the errors will be routed to that annotation's default backend service (instead of the global default backend).
So by inserting these annotations in our Ingress rule we are saying that all requests should go to new-http-backend unless the response has a return code listed in custom-http-errors. If that happens, the user will be redirected to old-http-backend, as specified in the default-backend annotation.
nginx.ingress.kubernetes.io/custom-http-errors: '403,404,500,502,503,504'
nginx.ingress.kubernetes.io/default-backend: old-http-backend
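A quick way to see the fallback in action: send a request with the expected Host header and watch which backend answers. A hedged check could look like this (the address is whatever kubectl get ingress my-app-ingress reports):
curl -H "Host: app.company.com" http://<ingress-address>/
# If new-http-backend answers with one of the listed codes (403, 404, 500, ...),
# the NGINX Ingress serves the response from old-http-backend instead.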

Related

Kubernetes, Loadbalancing, and Nginx Ingress - AKS

Stack:
Azure Kubernetes Service
NGINX Ingress Controller - https://github.com/kubernetes/ingress-nginx
AKS Loadbalancer
Docker containers
My goal is to create a K8s cluster that will allow me to use multiple pods, under a single IP, to create a microservice architecture. After working through tons of tutorials and documentation, I'm not having any luck with my end goal. I got to the point of being able to access a single deployment using the LoadBalancer, but introducing the Ingress has not been successful so far. The services are separated into their respective files for readability and ease of control.
Additionally, the Ingress Controller was added to my cluster as described in the installation instructions using: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.35.0/deploy/static/provider/cloud/deploy.yaml
LoadBalancer.yml:
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  loadBalancerIP: x.x.x.x
  selector:
    app: ingress-service
    tier: backend
  ports:
  - name: "default"
    port: 80
    targetPort: 80
  type: LoadBalancer
IngressService.yml:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress-service
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      - path: /api
        backend:
          serviceName: api-service
          servicePort: 80
api-deployment.yml
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api
  ports:
  - port: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  selector:
    matchLabels:
      app: api
      tier: backend
      track: stable
  replicas: 1
  template:
    metadata:
      labels:
        app: api
        tier: backend
        track: stable
    spec:
      containers:
      - name: api
        image: image:tag
        ports:
        - containerPort: 80
        imagePullPolicy: Always
      imagePullSecrets:
      - name: SECRET
The API in the image is exposed on port 80 correctly.
After applying each of the above yml services and deployments, I attempt a web request to one of the api resources via the LoadBalancer's IP and receive only a timeout on my requests.
Found my answer after hunting around enough. Basically, the problem was that the Ingress Controller already has a LoadBalancer Service built into its YAML, as mentioned in the comments above. However, the selector of that LoadBalancer requires marking your Ingress resources as part of the nginx class. That Ingress then points to each of the services attached to your pods. I also had to make a small modification to allow using a static IP in the provided load balancer.
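As a hedged illustration of that change (not the exact manifest from the answer above), the LoadBalancer Service that ships with the ingress-nginx cloud manifest is where the static IP goes, and its selector must target the controller pods. The name, namespace and labels below follow the v0.35.0 manifest and may differ in other versions:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  loadBalancerIP: x.x.x.x            # pre-provisioned static IP
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https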

How to fix "503 Service Temporarily Unavailable"

FYI:
I run Kubernetes on Docker Desktop for Mac
The website is based on the Nginx image
I run 2 simple website deployments on Kubernetes and use the NodePort service. Then I want to set up routing to the websites using an Ingress. When I open the browser and access the website, I get a 503 error like the images below. So, how do I fix this error?
### Service
apiVersion: v1
kind: Service
metadata:
  name: app-svc
  labels:
    app: app1
spec:
  type: NodePort
  ports:
  - port: 80
  selector:
    app: app1
---
apiVersion: v1
kind: Service
metadata:
  name: app2-svc
  labels:
    app: app2
spec:
  type: NodePort
  ports:
  - port: 80
  selector:
    app: app2
### Ingress-Rules
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /app1
        backend:
          serviceName: app-svc
          servicePort: 30092
      - path: /app2
        backend:
          serviceName: app2-svc
          servicePort: 30936
Yes, I ended up with the same error. Once I changed the service type to "ClusterIP", it worked fine for me.
Found this page after searching for a solution to nginx continually returning 503 responses despite the service names all being configured correctly. The issue for me was that I had configured the Kubernetes Service in a specific namespace but did not update the Ingress resource to be in the same namespace. Despite being such a simple solution, it was not at all obvious!
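As a hedged illustration of that fix, the namespace on the Ingress has to match the namespace of the Services it references (my-namespace is a placeholder):
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: app-ingress
  namespace: my-namespace      # must match the namespace of app-svc / app2-svc
spec:
  rules:
  - http:
      paths:
      - path: /app1
        backend:
          serviceName: app-svc
          servicePort: 80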
I advise you to use the ClusterIP service type.
Take a look at this useful article: services-kubernetes.
If you use Ingress, you have to know that Ingress isn't a type of Service, but rather an object that acts as a reverse proxy and single entry point to your cluster, routing requests to different services. The most basic Ingress is the NGINX Ingress Controller, where NGINX takes on the role of reverse proxy while also handling SSL.
Ingress is exposed to the outside of the cluster via ClusterIP and Kubernetes proxy, NodePort, or LoadBalancer, and routes incoming traffic according to the configured rules.
Example of service definition:
---
apiVersion: v1
kind: Service
metadata:
  name: app-svc
  labels:
    app: app1
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: app1
---
apiVersion: v1
kind: Service
metadata:
  name: app2-svc
  labels:
    app: app2
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: app2
Let me know if it helps.
First, you need to change the service type of your app services to ClusterIP, because the Ingress object is going to access these Pods (Services) from inside the cluster (a ClusterIP Service is used when you want to allow access to a Pod only from inside the cluster).
Second, make sure the services are running by running kubectl get services, and check the running service names against the names in the backend section of the Ingress routing rules.
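Putting both points together, here is a hedged sketch of what the corrected Ingress could look like, with servicePort set to the Services' port (80) rather than the NodePort values:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /app1
        backend:
          serviceName: app-svc
          servicePort: 80   # the Service port, not a NodePort
      - path: /app2
        backend:
          serviceName: app2-svc
          servicePort: 80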
Little late to this journey, but here is my comment on the issue.
I was having the same issue in the same environment (Docker Desktop-based Kubernetes with WSL2).
A couple of items can probably help.
Add a host entry in the rules section; the value will be kubernetes.docker.internal, like below:
rules:
- host: kubernetes.docker.internal
  http:
    paths:
    - path....
Check the service ports using kubectl get services to confirm that the same port is in your ingress rule definition for each of those backend services.
backend:
  service:
    name: my-apple-service
    port:
      number: 30101
kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-apple-service ClusterIP 10.106.121.95 <none> 30101/TCP 9h
my-banada-service ClusterIP 10.99.192.112 <none> 30101/TCP 9h

Service with both selector and explicit endpoints?

As part of a migration from a legacy service discovery framework to kube/CoreDNS, I would like to create a Service that knows how to auto-publish Endpoints, but that also has manually created endpoints.
Essentially I think I would like the following:
---
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
---
kind: Endpoints
apiVersion: v1
metadata:
  name: my-service
  annotations:
    transition: legacy
subsets:
- addresses:
  - ip: 1.2.3.4
  ports:
  - port: 9376
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-discovery-backend
  labels:
    app: MyApp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: MyApp
  template:
    metadata:
      labels:
        app: MyApp
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
As the docs imply, though, explicitly setting things up like this results in there only being one Endpoints object associated with the service -- whether it's the service's auto-created one or my manually specified one seems to vary.
Is the most reasonable way to use CoreDNS as service discovery for both services that know how to self-publish and external services to manually control endpoints until we are 100% migrated, and then just switch over to a selector-based approach?
It can't be done using just a Kubernetes Service, but I see one workaround here.
I believe you can use an additional Service backed by an Nginx Pod (or Deployment), manually configured (via a ConfigMap) as a reverse proxy for two backends: the Kubernetes Service name (with a selector, for the new backend) and a manual endpoint (FQDN or IP, for the legacy backend).
ingress -> Service1 -> Nginx --> Service2 -> Deployments/Pods
                         |
                         ---> Endpoint -> Legacy Servers
It's up to you how to balance traffic between them.
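If it helps, here is a minimal sketch of such a manually configured proxy as a ConfigMap mounted into the Nginx Pod; the Service name, namespace, legacy IP and port are placeholders based on the example above, not a tested setup:
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-proxy-conf
data:
  default.conf: |
    # selector-backed Service for the new backend (placeholder name)
    upstream new_backend {
      server my-service-new.default.svc.cluster.local:80;
    }
    # manually managed legacy endpoint (placeholder address)
    upstream legacy_backend {
      server 1.2.3.4:9376;
    }
    server {
      listen 80;
      location / {
        # route, weight, or split traffic between the two upstreams as needed
        proxy_pass http://new_backend;
      }
      location /legacy {
        proxy_pass http://legacy_backend;
      }
    }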

Ingress is not getting address on GKE and GCE

When creating the Ingress, no address is generated, and when viewed from the GKE dashboard it is always in the "Creating ingress" status.
Describing the Ingress does not show any events, and I cannot see any clues on the GKE dashboard.
Has anyone had a similar issue or any suggestions on how to debug?
My deployment.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: mobile-gateway-ingress
spec:
  backend:
    serviceName: mobile-gateway-service
    servicePort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: mobile-gateway-service
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  selector:
    app: mobile-gateway
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mobile-gateway-deployment
  labels:
    app: mobile-gateway
spec:
  selector:
    matchLabels:
      app: mobile-gateway
  replicas: 2
  template:
    metadata:
      labels:
        app: mobile-gateway
    spec:
      containers:
      - name: mobile-gateway
        image: eu.gcr.io/my-project/mobile-gateway:latest
        ports:
        - containerPort: 8080
Describing the ingress shows no events:
$ kubectl describe ingress mobile-gateway-ingress
Name: mobile-gateway-ingress
Namespace: default
Address:
Default backend: mobile-gateway-service:80 (10.4.1.3:8080,10.4.2.3:8080)
Rules:
Host Path Backends
---- ---- --------
* * mobile-gateway-service:80 (10.4.1.3:8080,10.4.2.3:8080)
Annotations:
kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{},"name":"mobile-gateway-ingress","namespace":"default"},"spec":{"backend":{"serviceName":"mobile-gateway-service","servicePort":80}}}
Events: <none>
With a simple LoadBalancer service, an IP address is given. The problem is only with the ingress resource.
The problem in this case was that I did not include the HttpLoadBalancing addon when creating the cluster!
My fault, but it would have been nice to have an event informing me of this mistake in the ingress resource.
Strangely, when I created a new cluster to follow the tutorial cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer using the default addons, including HttpLoadBalancing, I observed the same issue. Maybe I didn't wait long enough? Anyway, it is working now that I have included the addon.
To complete the accepted answer, it's worth noting that it's possible to activate the addon on an already existing cluster (from the Google console).
However, it will restart your cluster with downtime (in my case, it took several minutes on an almost empty cluster). Be sure that's acceptable in your case, and do tests.
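For reference, a hedged sketch of enabling it from the CLI instead of the console (cluster name and zone are placeholders):
# Enable the HTTP load balancing addon on an existing cluster
gcloud container clusters update my-cluster \
  --zone europe-west1-b \
  --update-addons HttpLoadBalancing=ENABLED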

Kubernetes Ingress (GCE) keeps returning 502 error

I am trying to setup an Ingress in GCE Kubernetes. But when I visit the IP address and path combination defined in the Ingress, I keep getting the following 502 error:
Here is what I get when I run: kubectl describe ing --namespace dpl-staging
Name: dpl-identity
Namespace: dpl-staging
Address: 35.186.221.153
Default backend: default-http-backend:80 (10.0.8.5:8080)
TLS:
dpl-identity terminates
Rules:
Host Path Backends
---- ---- --------
*
/api/identity/* dpl-identity:4000 (<none>)
Annotations:
https-forwarding-rule: k8s-fws-dpl-staging-dpl-identity--5fc40252fadea594
https-target-proxy: k8s-tps-dpl-staging-dpl-identity--5fc40252fadea594
url-map: k8s-um-dpl-staging-dpl-identity--5fc40252fadea594
backends: {"k8s-be-31962--5fc40252fadea594":"HEALTHY","k8s-be-32396--5fc40252fadea594":"UNHEALTHY"}
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
15m 15m 1 {loadbalancer-controller } Normal ADD dpl-staging/dpl-identity
15m 15m 1 {loadbalancer-controller } Normal CREATE ip: 35.186.221.153
15m 6m 4 {loadbalancer-controller } Normal Service no user specified default backend, using system default
I think the problem is dpl-identity:4000 (<none>). Shouldn't I see the IP address of the dpl-identity service instead of <none>?
Here is my service description: kubectl describe svc --namespace dpl-staging
Name: dpl-identity
Namespace: dpl-staging
Labels: app=dpl-identity
Selector: app=dpl-identity
Type: NodePort
IP: 10.3.254.194
Port: http 4000/TCP
NodePort: http 32396/TCP
Endpoints: 10.0.2.29:8000,10.0.2.30:8000
Session Affinity: None
No events.
Also, here is the result of executing: kubectl describe ep -n dpl-staging dpl-identity
Name: dpl-identity
Namespace: dpl-staging
Labels: app=dpl-identity
Subsets:
Addresses: 10.0.2.29,10.0.2.30
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
http 8000 TCP
No events.
Here is my deployment.yaml:
apiVersion: v1
kind: Secret
metadata:
  namespace: dpl-staging
  name: dpl-identity
type: Opaque
data:
  tls.key: <base64 key>
  tls.crt: <base64 crt>
---
apiVersion: v1
kind: Service
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
spec:
  type: NodePort
  ports:
  - port: 4000
    targetPort: 8000
    protocol: TCP
    name: http
  selector:
    app: dpl-identity
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
  annotations:
    kubernetes.io/ingress.allow-http: "false"
spec:
  tls:
  - secretName: dpl-identity
  rules:
  - http:
      paths:
      - path: /api/identity/*
        backend:
          serviceName: dpl-identity
          servicePort: 4000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: dpl-identity
    spec:
      containers:
      - image: gcr.io/munpat-container-engine/dpl/identity:0.4.9
        name: dpl-identity
        ports:
        - containerPort: 8000
          name: http
        volumeMounts:
        - name: dpl-identity
          mountPath: /data
      volumes:
      - name: dpl-identity
        secret:
          secretName: dpl-identity
Your backend k8s-be-32396--5fc40252fadea594 is showing as "UNHEALTHY".
Ingress will not forward traffic if the backend is UNHEALTHY; this results in the 502 error you are seeing.
It is being marked as UNHEALTHY because it is not passing its health check. You can check the health check settings for k8s-be-32396--5fc40252fadea594 to see if they are appropriate for your pod; it may be polling a URI or port that is not returning a 200 response. You can find these settings under Compute Engine > Health Checks.
If they are correct, then there are many steps between your browser and the container that could be passing traffic incorrectly. You could try kubectl exec -it PODID -- bash (or ash if you are using Alpine) and then try curl-ing localhost to see if the container is responding as expected. If it is, and the health checks are also configured correctly, then this would narrow the issue down to likely being with your service; you could then try changing the service from a NodePort type to a LoadBalancer and see if hitting the service IP directly from your browser works.
I was having the same issue. It turns out I had to wait a few minutes for the ingress to validate the service health. If someone is facing the same thing and has done all the steps like readinessProbe and livenessProbe, just ensure your ingress is pointing to a service of type NodePort, and wait a few minutes until the yellow warning icon turns into a green one. Also, check the logs on Stackdriver to get a better idea of what's going on. My readinessProbe and livenessProbe are on /login, for the gce class, so I don't think it has to be on /healthz.
The issue was indeed a health check and seemed "random" for my apps, where I used name-based virtual hosts to reverse proxy requests from the ingress via domains to two separate backend services. Both were secured using Let's Encrypt and kube-lego. My solution was to standardize the path for health checks for all services sharing an ingress, and to declare the readinessProbe and livenessProbe configs in my deployment.yml file.
I faced this issue with Google Cloud cluster node version 1.7.8 and found this issue that closely resembled what I experienced:
* https://github.com/jetstack/kube-lego/issues/27
I'm using gce and kube-lego, and my backend service health checks were on / while kube-lego's is on /healthz. It appears that differing paths for health checks with the gce ingress class might be the cause, so it may be worth updating backend services to match the /healthz pattern so they all use the same path (or, as one commenter in the GitHub issue stated, updating kube-lego to pass on /).
I had the same problem, and it persisted after I enabled the livenessProbe as well as the readinessProbe.
It turned out this was to do with basic auth. I had added basic auth to the livenessProbe and the readinessProbe, but it turns out the GCE HTTP(S) load balancer doesn't have a configuration option for that.
There seem to be a few other kinds of issues too, e.g. setting the container port to 8080 and the service port to 80 didn't work with the GKE ingress controller (yet I couldn't clearly identify what the problem was). Broadly, it looks to me like there is very little visibility, and running your own ingress controller is a better option in that respect.
I picked Traefik for my project; it worked out of the box, and I'd like to enable Let's Encrypt integration. The only change I had to make to the Traefik manifests was tweaking the service object to disable access to the UI from outside the cluster and to expose my app through an external load balancer (GCE TCP LB). Also, Traefik is more native to Kubernetes. I tried Heptio Contour, but something didn't work out of the box (I'll give it another go when the new version comes out).
I had the same issue. It turned out that the pod itself was running OK, which I tested via port-forwarding and accessing the health-check URL.
Port-forwarding can be activated in the console as follows:
$ kubectl port-forward <pod-name> local-port:pod-port
So if the pod is running OK and the ingress still shows an unhealthy state, there might be an issue with your service configuration. In my case my app selector was incorrect, causing the selection of a non-existent pod. Interestingly, this isn't shown as an error or alert in the Google console.
Definition of the pods:
# pod-definition.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <pod-name>
  namespace: <namespace>
spec:
  selector:
    matchLabels:
      app: **<pod-name>**
  template:
    metadata:
      labels:
        app: <pod-name>
    spec:
      # spec definition follows

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: <name-of-service-here>
  namespace: <namespace>
spec:
  type: NodePort
  selector:
    app: **<pod-name>**
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    name: <port-name-here>
The "Limitations" section of the kubernetes documentation states that:
All Kubernetes services must serve a 200 page on '/', or whatever custom value you've specified through GLBC's --health-check-path argument.
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/cluster-loadbalancing/glbc#limitations
I solved the problem by:
1. Removing the service from the ingress definition
2. Deploying the ingress: kubectl apply -f ingress.yaml
3. Adding the service back to the ingress definition
4. Deploying the ingress again
Essentially, I followed Roy's advice and tried to turn it off and on again.
The log can be read from Stackdriver Logging; in my case, it was a backend_timeout error. After increasing the default timeout (30s) via BackendConfig, it stopped returning 502 even under load.
More on:
https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service#creating_a_backendconfig
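For illustration, a minimal sketch of such a BackendConfig and how it is attached to a Service, based on the documentation linked above; names and the timeout value are placeholders:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig
spec:
  timeoutSec: 120                 # raise the 30s default
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # attach the BackendConfig to this Service's backends
    cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080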
I fixed this issue after adding the following readiness and liveness probes with successThreshold: 1 and failureThreshold: 3. Also, I kept initialDelaySeconds at 70 because sometimes an application responds a bit late; it may vary per application.
NOTE: Also ensure that the path in httpGet exists in your application (in my case /api/books); otherwise GCP pings the /healthz path, which is not guaranteed to return 200 OK.
readinessProbe:
  httpGet:
    path: /api/books
    port: 80
  periodSeconds: 5
  successThreshold: 1
  failureThreshold: 3
  initialDelaySeconds: 70
  timeoutSeconds: 60
livenessProbe:
  httpGet:
    path: /api/books
    port: 80
  initialDelaySeconds: 70
  periodSeconds: 5
  successThreshold: 1
  failureThreshold: 3
  timeoutSeconds: 60
I was able to sort it out after struggling a lot and trying many things.
Keep learning & sharing.
I had the same issue when I was using the wrong image and the request couldn't be satisfied, as the configurations were different.