I have a bunch of microservices running in a Kubernetes cluster, where each microservice implements a basic health check over HTTP.
E.g., for the endpoint /health, each service returns HTTP 200 if that particular service is currently healthy, or some other HTTP 4xx/5xx code (and possibly additional info) if not.
I see Kubernetes has its own built-in concept of an HTTP health check: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-http-request
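For reference, a minimal liveness probe against such an endpoint looks roughly like this (the names, image, and port here are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  containers:
    - name: my-service
      image: example.com/my-service:latest   # placeholder image
      livenessProbe:
        httpGet:
          path: /health    # the health endpoint described above
          port: 8080       # assumed service port
        initialDelaySeconds: 5
        periodSeconds: 10
```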
Unfortunately it's not quite what I want. I'd like to be able to trigger an alert and record the state of each health check in some database, so I can quickly see what state all my services are in, as well as alert on any service in an unhealthy state.
Are there existing tools or approaches in Kubernetes I should use for this sort of thing, or will I need to build a custom solution?
I was considering having a general "HealthCheck" service with which each microservice would register on startup. The "HealthCheck" service would then monitor the health of each service and trigger alerts for any issues it finds.
I would caution against trying to build your own in-house monitoring solution. There are considerable drawbacks to that approach.
If all you need is external HTTP health checks of your services, then many existing monitoring solutions will do fine. You can either install a traditional IT monitoring solution like Zabbix or Nagios, or use a SaaS offering such as Datadog.
There is also the Blackbox Exporter for Prometheus, which is very popular among Kubernetes users.
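As a sketch, a Prometheus scrape job probing a /health endpoint through the Blackbox Exporter could look like this (the service and exporter addresses are assumptions):

```yaml
scrape_configs:
  - job_name: blackbox-http
    metrics_path: /probe
    params:
      module: [http_2xx]    # module defined in the Blackbox Exporter config
    static_configs:
      - targets:
          - http://my-service.default.svc:8080/health   # hypothetical service
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115   # assumed exporter address
```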
Many of these options require a learning curve of some steepness.
Related
I have read this question, which is very similar to what I am asking, but I still wanted to write a new question since the accepted answer there seems very incomplete and potentially wrong.
Basically, it seems like there is some missing or contradictory information regarding built-in load balancing for regular Kubernetes Services (I am not talking about LoadBalancer-type Services). For example, the official Cilium documentation states that "Kubernetes doesn't come with an implementation of Load Balancing". In addition, I couldn't find any information in the official Kubernetes documentation about load balancing for internal Services (there was only a section discussing this under Ingresses).
So my question is - how does load balancing or distribution of requests work when we make a request from within a Kubernetes cluster to the internal address of a Kubernetes service?
I know there's a Kubernetes proxy on each node, and that DNS records are created for such Services, but what about Services that span multiple pods and nodes? There has to be some form of request distribution or load balancing, or else this just wouldn't work at all, no?
A standard Kubernetes Service provides basic load-balancing. Even for a ClusterIP-type Service, the Service has its own cluster-internal IP address and DNS name, and forwards requests to the collection of Pods specified by its selector:.
In normal use, it is enough to create a multiple-replica Deployment, set a Service to point at its Pods, and send requests only to the Service. All of the replicas will receive requests.
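A minimal sketch of that setup (names, image, and ports are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:latest   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app    # reachable in-cluster as http://my-app (port 80)
spec:
  selector:
    app: my-app   # matches the Pods created by the Deployment above
  ports:
    - port: 80
      targetPort: 8080
```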
The documentation discusses the implementation of internal load balancing in more detail than an application developer normally needs. Unless your cluster administrator has done extra setup, requests will be spread across the Pods – round-robin if the proxy runs in IPVS mode, effectively random with the default iptables mode – so each replica receives a share of the traffic.
... the official Cilium documentation states ...
This is almost certainly a statement about external load balancing. From a cluster administrator's perspective (not a programmer's), a "plain" Kubernetes installation doesn't include an external load-balancer implementation, and a LoadBalancer-type Service behaves identically to a NodePort-type Service.
There are obvious deficiencies to this kind of scheduling, most notably if you do wind up with individual network requests that take a long time and a lot of resources to service. As an application developer, the best way to address this is to make these very-long-running requests asynchronous: return something like an HTTP 201 Created status with a unique per-job URL, and do the actual work in a separate queue-backed worker.
Very simple question: I have a front-end app running in Kubernetes. I would like to create a back-end containerized app that would, obviously, also run in Kubernetes.
User actions in the frontend would need to trigger the execution of a command on the backend ("echo success!", for example). The UI also needs to know what the command's output was.
What is the best way to implement this in k8s?
Either through an internal Service, or the two apps could also live in the same Pod.
Perhaps there is some kind of messaging involved, with an application such as RabbitMQ?
That depends on your application and how you are planning to architect it.
Some people host the frontend as static files in a bucket and send HTTP requests from there to the backend.
You can keep the frontend and backend in different Pods, or in a single Pod.
For example, if you are using Node.js with Express, you can run the backend as a simple API-service Pod and have it serve the frontend as well.
You can use the K8s Service name for internal communication instead of adding a message broker (RabbitMQ or Redis could also be used), unless your web app really needs one.
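As a sketch of the Service-name approach (all names here are assumptions): expose the backend as a ClusterIP Service and let the frontend reach it via the cluster DNS name, for example injected through an environment variable:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 3000    # assumed Express port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: example.com/frontend:latest   # placeholder image
          env:
            - name: BACKEND_URL
              # in-cluster DNS name of the backend Service above
              value: http://backend.default.svc.cluster.local
```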
I would also recommend checking out this tutorial: https://learnk8s.io/deploying-nodejs-kubernetes
GitHub repo of the application: https://github.com/learnk8s/knote-js/tree/master/01
Official example: https://kubernetes.io/docs/tutorials/stateless-application/guestbook/
We are developing tons of microservices, and they all run in Kubernetes. As ops, I need to define probes for each microservice, so we will create a health check API for each one. What are the best practices for this API? What are the best practices for the probes? Do we need to check only the service's own health, or the database connection too (and more)? Is that redundant? The databases are in Kubernetes too and have their own probes. Can we just use the /version API as the probe?
I'm looking for feedback and documentation. Thank you.
An argument for including databases and other downstream dependencies in the health check is the following:
Assume you have a load balancer exposing some number of microservices to the outside world. If, due to heavy load, the database of one of these microservices goes down, and this is not reflected in the microservice's health check, the load balancer will keep directing traffic to the microservice, further increasing the problem the database is experiencing.
If instead the health check included downstream dependencies, the load balancer would stop directing traffic to the microservice (and hopefully show a nice error message to the user). This would give the database time to recover from the increased load (and give ops time to react).
So I would argue that using a basic /version endpoint is not a good idea.
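In Kubernetes terms, a common way to express this (a sketch; the endpoint paths and thresholds are assumptions) is to put the deep check that includes downstream dependencies behind the readiness probe, and keep the liveness probe shallow, so a broken database takes the Pod out of rotation without restarting it:

```yaml
containers:
  - name: my-service
    image: example.com/my-service:latest    # placeholder image
    livenessProbe:
      httpGet:
        path: /health/live    # hypothetical shallow check: is the process up?
        port: 8080
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health/ready   # hypothetical deep check: DB and other dependencies
        port: 8080
      periodSeconds: 5
      failureThreshold: 3
```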
A microservice generally calls other microservices/services to retrieve data, and there is always a chance that a downstream service is down. You can use the Circuit Breaker pattern, which is suited to preventing an application from repeatedly trying to invoke a remote service or access a shared resource when the operation is highly likely to fail.
You will find this under Observability Patterns (Health Check) in the microservices pattern literature. Each service needs an endpoint that can be used to check the health of the application, such as /health.
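If you happen to run a service mesh such as Istio (which comes up later in this thread), circuit breaking can also be configured declaratively instead of in application code. A minimal sketch, with a hypothetical downstream host:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: downstream-circuit-breaker
spec:
  host: downstream.default.svc.cluster.local   # hypothetical downstream service
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 10   # reject requests once 10 are queued
    outlierDetection:
      consecutive5xxErrors: 5         # eject a backend after 5 consecutive 5xx errors
      interval: 30s
      baseEjectionTime: 60s
```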
We have a service with multiple replicas that works with storage without transactions or blocking approaches, so we need to somehow synchronize concurrent requests between the multiple instances by some "sharding" key. Right now we host this service in a Kubernetes environment as a ReplicaSet.
Do you know of any simple out-of-the-box approaches for this, so we don't have to implement it from scratch?
Here are several of our ideas on how to do this:
Deploy the service as a StatefulSet and implement a proxy API that routes traffic to a specific pod in the StatefulSet based on a sharding key from the HTTP request. In this scenario, all requests that need to be synchronized are handled by one instance, so this case becomes easy to handle (see the sketch after this list).
Deploy the service as a StatefulSet and implement some custom logic in the service itself to re-route traffic to the specific instance (or a process on that exact instance). As I understand it, this couldn't be an abstract implementation; it would only work in a Kubernetes environment.
Somehow expose each pod's IP outside the cluster and implement the routing logic on the client side.
Just implement synchronization between instances through some third-party service like Redis.
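For option 1, a minimal sketch of the StatefulSet-plus-headless-Service part (all names are placeholders). Each replica gets a stable DNS name such as my-shards-0.my-shards, so a proxy layer can map a sharding key to a specific pod:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-shards
spec:
  clusterIP: None          # headless: gives each pod a stable DNS name
  selector:
    app: my-shards
  ports:
    - port: 8080
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-shards
spec:
  serviceName: my-shards   # ties the pod DNS names to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: my-shards
  template:
    metadata:
      labels:
        app: my-shards
    spec:
      containers:
        - name: my-shards
          image: example.com/my-service:latest   # placeholder image
          ports:
            - containerPort: 8080
```

The proxy would then resolve something like my-shards-<hash(key) % replicas>.my-shards and forward the request there.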
I would like to try routing traffic to specific pods. If you know of standard approaches for handling this case, it would be much appreciated.
Thank you a lot in advance!
Another approach would be to put a message queue (like Kafka or RabbitMQ) in front of your service.
Your pods then subscribe to the MQ topic/stream, and each pod decides whether or not it should process a given message.
Also, try looking into service meshes like Istio or Linkerd.
They might have an OOTB solution for your use-case, although I wasn't able to find one.
Remember that a NetworkPolicy is not traffic routing!
Pods are intended to be stateless and indistinguishable from one another; see pod-networking.
I recommend Istio. It has a special component responsible for routing: Envoy, a high-performance proxy developed in C++ that mediates all inbound and outbound traffic for all services in the service mesh.
Useful article: istio-envoy-proxy.
Istio documentation: istio-documentation.
A useful Istio explanation: https://www.youtube.com/watch?v=e2kowI0fAz0.
You should also be able to create a Deployment per customer group, and a Service per Deployment. The NGINX Ingress can then be told to map incoming requests, by whatever attributes are relevant, to the specific customer-group Services.
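A sketch of that mapping with the NGINX Ingress, assuming hypothetical customer groups routed by URL path:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: customer-routing
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /group-a               # hypothetical customer group A
            pathType: Prefix
            backend:
              service:
                name: my-service-group-a # Service in front of group A's Deployment
                port:
                  number: 8080
          - path: /group-b               # hypothetical customer group B
            pathType: Prefix
            backend:
              service:
                name: my-service-group-b
                port:
                  number: 8080
```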
Another solution is to use kube-router.
Kube-router can be run as an agent or a Pod (via a DaemonSet) on each node, and leverages the standard Linux technologies iptables, IPVS/LVS, ipset, and iproute2.
Kube-router uses the IPVS/LVS technology built into Linux to provide L4 load balancing. Each ClusterIP, NodePort, and LoadBalancer Kubernetes Service type is configured as an IPVS virtual service, and each Service endpoint is configured as a real server for the virtual service. The standard ipvsadm tool can be used to verify the configuration and monitor the active connections.
How it works: service-proxy.
I have some servers I want to deploy in Kubernetes. The clients of those servers will also be in Kubernetes. Clients and servers can independently be deployed or scaled.
The clients must know the list of the servers (IPs). I have an HTTP endpoint on the clients to update the list of the servers while the clients are running (hot config reload).
All this is currently running outside of Kubernetes. I want to migrate to GCP.
What's the industry standard regarding pod updates and notifications? I want to get notified when the servers are updated, so I can call the endpoint on the clients to update their list of servers.
I can't use a LoadBalancer, since the clients really need to call a specific server (the business logic is in the clients).
Thanks
The standard way to call a group of pods that offer some functionality is a Service. If you don't want the automatic load balancing or single IP address that regular Services provide, you should look into headless Services. Resolving a headless Service returns a list of DNS A records that point to the pods behind the Service, and this list is automatically updated as pods become available or unavailable.
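A minimal sketch of such a headless Service (names assumed). A DNS lookup of servers.default.svc.cluster.local then returns one A record per ready server pod:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: servers
spec:
  clusterIP: None      # headless: DNS resolves to the pod IPs directly
  selector:
    app: server        # label on the server pods
  ports:
    - port: 8080
```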
While I think modifying an existing script to just pull a list from a headless Service is much simpler, it might be worth mentioning CRDs (Custom Resource Definitions) as well.
You could build a custom controller that listens to Service events and then posts the data from those events to an HTTP endpoint of another Service or Ingress. The custom resource would define which Service to watch and where to post the results.
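A sketch of what such a custom resource definition might look like (the group, kind, and fields are entirely hypothetical, and the controller consuming it would still have to be written):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: servicewatchers.example.com    # hypothetical API group
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: ServiceWatcher
    plural: servicewatchers
    singular: servicewatcher
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                serviceName:      # which Service's endpoints to watch
                  type: string
                notifyURL:        # where to POST the updated endpoint list
                  type: string
```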
Though this is probably a much heavier-weight solution than just having a sidecar / separate container in a Pod polling the Service for changes (which sounds closer to your existing model).
I upvoted Alassane's answer, as I think it is the correct first path to something like this before building a CRD.