I have configured a Kubernetes deployment in an EKS cluster. The deployment contains 2 replicas, and the 2 replicas are behind an Application Load Balancer. I am trying to implement a Grafana dashboard that shows whether my application is up.
I have already configured CloudWatch as a data source for Grafana. Any thoughts on which CloudWatch metrics I should use for this?
I have used the HealthyHostCount metric, but sometimes one of my 2 replicas goes down while my app is still accessible, since the other replica is up. So I am looking for an ALB CloudWatch metric related to this (health checks on the ALB, ...).
There is no CloudWatch metric for that. But if you have a proper health check implementation, you can use HealthyHostCount with CloudWatch metric math and calculate it: e.g. if HealthyHostCount is 0 then your AppAvailability is 0, otherwise 1. Grafana also supports CloudWatch metric math.
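A minimal sketch of that calculation, written as the query structure that CloudWatch's GetMetricData API accepts. The dimension values, the `Maximum` statistic and the 60-second period are assumptions for illustration; the `availability` helper only mirrors the same rule locally so the logic is easy to check.

```python
# Sketch: a CloudWatch metric math query that turns HealthyHostCount into 0/1.
# Dimension values passed in are placeholders, not real resources.

def availability_queries(target_group, load_balancer, period=60):
    """Build a MetricDataQueries list for CloudWatch GetMetricData."""
    return [
        {
            # Raw metric: number of healthy targets behind the ALB.
            "Id": "m1",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",
                    "MetricName": "HealthyHostCount",
                    "Dimensions": [
                        {"Name": "TargetGroup", "Value": target_group},
                        {"Name": "LoadBalancer", "Value": load_balancer},
                    ],
                },
                "Period": period,
                "Stat": "Maximum",  # assumption: any healthy host counts as "up"
            },
            "ReturnData": False,
        },
        {
            # Metric math: 1 if at least one host is healthy, otherwise 0.
            "Id": "availability",
            "Expression": "IF(m1 > 0, 1, 0)",
            "Label": "AppAvailability",
            "ReturnData": True,
        },
    ]


def availability(healthy_host_counts):
    """Apply the same rule locally to a list of HealthyHostCount samples."""
    return [1 if count > 0 else 0 for count in healthy_host_counts]
```

With boto3 installed, the list could be passed as `MetricDataQueries` to `get_metric_data` on a CloudWatch client, together with `StartTime` and `EndTime`. In Grafana, the same `IF(m1 > 0, 1, 0)` expression would go into a CloudWatch math expression query that references the ID of the HealthyHostCount query.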
Given a Kubernetes cluster that runs a certain application in a pod, is there any way to expose an internal parameter of the application (e.g., socket buffer, concurrent requests in the application, number of items in a certain application queue, ...) and then ask the Kubernetes horizontal/vertical pod autoscaler to scale up or down based on the value of that internal application parameter?
Surely, HPA supports custom metrics. You can push your custom metrics to Prometheus and configure the HPA to scale on them.
There is a more detailed write-up on how to use HPA custom metrics with Prometheus to scale pods. Refer to the link below for more details.
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md
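As a rough sketch of what the HPA side could look like, the snippet below generates an `autoscaling/v2` HorizontalPodAutoscaler that scales on a per-pod custom metric. The deployment name `my-app`, the metric name `queue_length` and the target value are hypothetical, and it assumes an adapter (for example prometheus-adapter) already serves that metric through the custom metrics API.

```python
import json


def hpa_manifest(deployment, metric_name, target_average,
                 min_replicas=1, max_replicas=10):
    """Build an autoscaling/v2 HPA manifest scaling on a per-pod custom metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # "Pods" metrics are averaged across all pods of the target.
                    "type": "Pods",
                    "pods": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": target_average,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names: scale "my-app" to keep ~30 queued items per pod.
    print(json.dumps(hpa_manifest("my-app", "queue_length", "30"), indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.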
I have an AKS cluster where I have deployed some microservices. I have a Prometheus system connected to that and I'm able to get a lot of metrics.
One metric that is missing for me is the size of the requests that go into each of the deployed microservices.
Do you know how to get that metric in the environment described above?
I have two Kubernetes clusters representing dev and staging environments.
Separately, I am also deploying a custom DevOps dashboard which will be used to monitor these two clusters. On this dashboard I will need to show information such as:
RAM/HD Space/CPU usage of each deployed Pod in each environment
Pod health (as in if it has too many container restarts etc)
Pod uptime
All these stats have to be available at the cluster level and preferably also per namespace. That is, if I query for a particular namespace, I have to get all the resource usage of that namespace.
So the webservice layer of my dashboard will send a service request to the master node of my respective cluster in order to fetch this information.
Another thing I need is to implement real time notifications in my DevOps dashboard. Every time a container fails, I need to catch that event and notify relevant personnel.
I have been reading around and two things that pop up a lot are Prometheus and Metric Server. Do I need both or will one do? I set up Prometheus on a local cluster but I can't find any endpoints it exposes which could be called by my dashboard service. I'm also trying to set up Prometheus AlertManager but so far it hasn't worked as expected. Trying to fix it now. Just wanted to check if these technologies have the capabilities to meet my requirements.
Thanks!
I don't know why you are considering your own custom monitoring system. The Prometheus Operator provides all the functionality that you mentioned.
You will end up with just your own Grafana dashboard showing all the required information.
If you need custom notifications, you can set them up in Alertmanager by creating the appropriate prometheusrules.monitoring.coreos.com resources; you can find a lot of preconfigured PrometheusRules in kubernetes-mixin.
Using labels and namespaces in Alertmanager, you can set up a route that notifies the person responsible for a given deployment.
Do I need both or will one do? Yes, you need both: Prometheus collects and aggregates metrics, while Metrics Server exposes resource metrics from your cluster nodes (through the Kubernetes Metrics API, which `kubectl top` and the autoscalers use).
If you have problems with Prometheus, Alertmanager and so on, consider using the Helm chart as an entry point.
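For the container-failure notifications mentioned above, a PrometheusRule could look roughly like the one this sketch generates. The alert name, threshold, time window and severity label are assumptions; the expression relies on the `kube_pod_container_status_restarts_total` metric from kube-state-metrics, and the Prometheus Operator may additionally require specific labels on the resource, depending on how its rule selector is configured.

```python
import json


def restart_alert_rule(namespace, threshold=3, window="15m"):
    """Build a PrometheusRule that fires when containers restart too often."""
    expr = (
        f'increase(kube_pod_container_status_restarts_total'
        f'{{namespace="{namespace}"}}[{window}]) > {threshold}'
    )
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {
            "name": f"{namespace}-container-restarts",
            "namespace": namespace,
        },
        "spec": {
            "groups": [
                {
                    "name": "container-restarts",
                    "rules": [
                        {
                            "alert": "ContainerRestartingTooOften",  # hypothetical name
                            "expr": expr,
                            "for": "5m",
                            # Alertmanager routes can match on these labels.
                            "labels": {"severity": "warning"},
                            "annotations": {
                                "summary": "Container {{ $labels.container }} in pod "
                                           "{{ $labels.pod }} is restarting frequently",
                            },
                        }
                    ],
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(restart_alert_rule("staging"), indent=2))
```

The `namespace` label on the resulting alert is what an Alertmanager route would match on to pick the right receiver for each environment or team.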
Prometheus + Grafana are a pretty standard setup.
Installing kube-prometheus or prometheus-operator via Helm will give you Grafana, Alertmanager, node-exporter and kube-state-metrics by default, all set up for Kubernetes metrics.
Configure Alertmanager to do something with the alerts. SMTP is usually the first thing to be set up, but I would recommend some sort of event manager if this is a service people need to rely on.
Although a Grafana dashboard isn't part of your requirements, it will show you how to connect to Prometheus as a data source. There is documentation on adding a Prometheus data source to Grafana.
There are a number of prebuilt charts available to add to Grafana. There are some charts to visualise alertmanager too.
Your external service won't be collecting the metrics itself; it will be querying the data that Prometheus has collected and stored inside your cluster. To access the API externally, you will need to set up an external path to the Prometheus service. This can be configured via an ingress controller in the Helm deployment:
prometheus.ingress.enabled: true
You can do the same for the alertmanager API and grafana if needed.
alertmanager.ingress.enabled: true
grafana.ingress.enabled: true
You could use Grafana outside the cluster as your dashboard via the same prometheus ingress if it proves useful.
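Once the ingress is in place, a dashboard backend can call the Prometheus HTTP API. The sketch below runs an instant query against the `/api/v1/query` endpoint; the base URL is a placeholder for whatever hostname your ingress exposes, and authentication and TLS settings are left out.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def run_instant_query(base_url, promql, timeout=10.0):
    """Execute the query and return the list of result series."""
    with urllib.request.urlopen(instant_query_url(base_url, promql),
                                timeout=timeout) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {body}")
    return body["data"]["result"]


if __name__ == "__main__":
    # Placeholder host: replace with the hostname configured on the ingress.
    for series in run_instant_query("http://prometheus.example.com", "up"):
        print(series["metric"], series["value"])
```

Each element of the result carries the series labels under `metric` and a `[timestamp, value]` pair under `value`.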
I have configured a working EFK (Elasticsearch, Fluentd, Kibana) stack in one of my Kubernetes clusters built in GCP. I have two more clusters and have installed the same EFK stack in those as well. Now, if I want to monitor the logs of each cluster environment, I need to check all three Kibana consoles. Is it possible to centralize the EFK stacks built in the three clusters, so that I can see the pod logs from all my clusters in a single Kibana console? Any help or suggestions would be appreciated.
In fact, Kibana only visualizes data that exists in Elasticsearch and lets you sort and manage it. Let's say you have 3 Kubernetes clusters. Consequently, you have 3 Fluentd DaemonSets. All you need to do is configure all Fluentd deployments to send data to a single Elasticsearch endpoint, the one to which Kibana is connected.
I would like to see Kubernetes Service-level metrics in Grafana from the underlying Prometheus server.
For instance:
1) If I have 3 application pods exposed through a Service, I would like to see Service-level metrics for CPU, memory and network I/O pressure, the total number of requests, and the number of failed requests.
2) Also, if I have a group of pods (replicas) related to an application which doesn't have a Service on top of them, I would like to see the aggregated metrics of those pods in a single view in Grafana.
What would be the Prometheus queries to achieve this?
Service level metrics for CPU, memory & network I/O pressure
If you have Prometheus installed on your Kubernetes cluster, all those statistics are already being collected by Prometheus. There are many good articles about how to install and use Kubernetes with Prometheus; try checking that one, as an example.
Here is an example of a request to fetch container memory usage:
container_memory_usage_bytes{image="CONTAINER:VERSION"}
Total # of requests,# of requests failed
Those are service-level metrics, and to collect them you need a Prometheus exporter built specifically for your service. Check the list of exporters, find the one you need for your service and follow its instructions.
If you cannot find an exporter for your application, you can write one yourself; here is the official documentation about it.
application which doesn't have Service on top of them would like to see the aggregated metrics of the pods related to that application in a single view on grafana
It is possible to combine any graphs in a single view in Grafana using Dashboards and Panels. Check the official documentation; all those topics are covered in detail and are easy to understand.
Aggregation can be done by Prometheus itself, using its aggregation operators.
All metrics from Kubernetes have labels, so you can group by them:
sum(http_requests_total) by (application, group), where application and group are labels.
Also, here is the official Prometheus guide on how to add Prometheus to Grafana as a data source.
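To illustrate the label-based aggregation for a group of replicas without a Service, the sketch below builds PromQL strings that sum container-level metrics over all pods whose names share a prefix. The metric names are the ones exposed by the kubelet's cAdvisor endpoint; the `namespace` and `pod` label names, the pod-name prefix convention and the 5-minute rate window are assumptions that depend on your Kubernetes version and scrape configuration.

```python
def pod_selector(namespace, pod_prefix, containers_only=False):
    """Build a label selector matching all pods that share a name prefix."""
    matchers = [f'namespace="{namespace}"', f'pod=~"{pod_prefix}-.*"']
    if containers_only:
        # Skip the pod-level aggregate series so usage is not counted twice.
        matchers.append('container!=""')
    return "{" + ", ".join(matchers) + "}"


def app_queries(namespace, pod_prefix, window="5m"):
    """Return aggregated PromQL queries for one application's pods."""
    per_container = pod_selector(namespace, pod_prefix, containers_only=True)
    per_pod = pod_selector(namespace, pod_prefix)
    return {
        "cpu_cores": f"sum(rate(container_cpu_usage_seconds_total{per_container}[{window}]))",
        "memory_bytes": f"sum(container_memory_working_set_bytes{per_container})",
        "network_rx_bytes_per_s": f"sum(rate(container_network_receive_bytes_total{per_pod}[{window}]))",
        "network_tx_bytes_per_s": f"sum(rate(container_network_transmit_bytes_total{per_pod}[{window}]))",
    }


if __name__ == "__main__":
    # Hypothetical application "myapp" running in namespace "staging".
    for name, query in app_queries("staging", "myapp").items():
        print(f"{name}: {query}")
```

Each query returns a single aggregated series, so every one of them can back its own Grafana panel on the same dashboard.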