Kubernetes - monitor number of requests

I have an app running on Google Container Engine.
I would like to monitor the number of requests per second my API is receiving. How can I do this?
Is there a way to monitor this using historical metrics in Stackdriver, as I have opted for Stackdriver Premium?

Looking at https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/ I see that Stackdriver is deployed in the cluster by default. According to https://cloud.google.com/monitoring/api/metrics, it looks like this includes the metrics you are looking for.
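As a starting point, here's a minimal sketch of a Metrics Explorer filter for request counts, assuming your API is exposed through a Google HTTPS load balancer (if it's exposed differently, the metric and resource types will differ):

    metric.type = "loadbalancing.googleapis.com/https/request_count"
    resource.type = "https_lb_rule"

Applying a per-second rate aligner (e.g. ALIGN_RATE) to this metric in Metrics Explorer or the Monitoring API then gives you requests per second.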

Related

Is GKE built into the Anthos solution by default? Getting Anthos Metrics

I have a cluster with 7 nodes and a lot of services, nodes, etc. in the Google Cloud Platform. I'm trying to get some metrics with StackDriver Legacy, so in the Google Cloud Console -> StackDriver -> Metrics Explorer I have the full set of Anthos metrics listed, but when I try to create a chart based on those metrics it doesn't show any data; the only response I get in the panel is "no data is available for the selected time frame", even after changing the time frame and so on.
Is it right to think that with Anthos metrics I can retrieve information about my cronjobs, pods, and services, like failed initializations and job failures? And if so, can I do it with StackDriver Legacy, or do I need to upgrade to StackDriver Kubernetes Engine Monitoring?
The Anthos solution includes what's called GKE On-Prem. I'd take a look at the instructions for using logging and monitoring on GKE On-Prem. Stackdriver monitors GKE On-Prem clusters in a similar way to cloud-based GKE clusters.
However, there's a note saying that currently, Stackdriver only collects cluster logs and system component metrics; the full Kubernetes monitoring experience will be available in a future release.
You should also check that you've met all the configuration requirements.

Custom CloudWatch metrics with the EKS CloudWatch Agent

I have set up Container Insights as described in the documentation.
Is there a way to remove some of the metrics sent over to CloudWatch?
Details:
I have a small cluster (3 client-facing namespaces, ~8 services per namespace) with some custom monitoring, logging, etc. in their own separate namespaces, and I just want to use CloudWatch for critical client-facing metrics.
The problem I am having is that the agent sends over 500 metrics to CloudWatch, where I am really only interested in a few of the important ones, especially as AWS bills per metric.
Is there any way to limit which metrics get sent to CloudWatch?
It would be especially helpful if I could only send metrics from certain namespaces, for example, excluding the kube-system namespace.
My configmap is:

    cwagentconfig.json: |
      {
        "logs": {
          "metrics_collected": {
            "kubernetes": {
              "cluster_name": "*****",
              "metrics_collection_interval": 60
            }
          },
          "force_flush_interval": 5
        }
      }
I have searched for a while now, but couldn't really find anything on:

    "metrics_collected": {
      "kubernetes": {
I've looked as best I can and you're right, there's little or nothing to find on this topic. Before I make the obvious-but-unhelpful suggestions of either using Prometheus or asking on the AWS forums, let's take a quick look at what the CloudWatch agent actually does.
The CloudWatch agent gets container metrics either from cAdvisor, which runs as part of the kubelet on each node, or from the Kubernetes metrics-server API (which also gets its metrics from the kubelet and cAdvisor). cAdvisor is well documented, and it's likely that the CloudWatch agent uses the Prometheus-format metrics cAdvisor produces to construct its own list of metrics.
That's just a guess though, unfortunately, since the CloudWatch agent doesn't seem to be open source. It may therefore be possible to just set a 'measurement' option within the kubernetes section and select metrics based on Prometheus metric names, but that's probably not supported. (If you do ask AWS, the Premium Support team is supposed to keep an eye on the forums, so you might get lucky and get an answer without paying for support.)
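Purely to illustrate the guess above: if such a 'measurement' option existed, it might look like the sketch below. To be clear, the "measurement" key is an assumption, not a documented agent setting (JSON doesn't allow comments, so the hedging lives here), and the metric names are just examples of Container Insights metric names:

    cwagentconfig.json: |
      {
        "logs": {
          "metrics_collected": {
            "kubernetes": {
              "cluster_name": "*****",
              "metrics_collection_interval": 60,
              "measurement": ["pod_cpu_utilization", "pod_memory_utilization"]
            }
          }
        }
      }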
So, if you can't cut down the metrics created by Container Insights, what are your other options? Prometheus is easy to deploy, and you can set up recording rules to cut down on the number of metrics it actually saves. It doesn't push to CloudWatch by default, but you can keep the metrics locally if you have some space on your node, or use a remote storage service like MetricFire (the company I work for, to be clear!), which provides Grafana to go along with it. You can also export metrics from CloudWatch and use Prometheus as your single source of truth, but that means more storage on your cluster.
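As a minimal sketch of such a recording rule (the rule name here is made up; container_cpu_usage_seconds_total is a standard cAdvisor metric):

    groups:
      - name: namespace_aggregations
        rules:
          # Keep one pre-aggregated CPU series per namespace
          # instead of one series per container
          - record: namespace:container_cpu_usage_seconds:rate5m
            expr: sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)

You can then point dashboards at the aggregated series rather than the raw per-container ones.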
If you prefer to view your metrics in CloudWatch, there are tools like prometheus-to-cloudwatch which scrape Prometheus endpoints and send the data to CloudWatch, much like (I'm guessing) the CloudWatch agent does. This service has include and exclude settings for deciding which metrics are sent to CloudWatch.
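A sketch of how such a deployment might be invoked; I'm going from memory of that project's README, so treat the flag names and values as assumptions and verify them against the current docs:

    # Hypothetical invocation of prometheus-to-cloudwatch;
    # check the flag names against the project's README before use
    prometheus-to-cloudwatch \
      -cloudwatch_namespace MyCluster \
      -cloudwatch_region us-east-1 \
      -prometheus_scrape_url http://my-service.default:8080/metrics \
      -include_metrics 'http_requests_.*' \
      -exclude_metrics 'kube_system_.*'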
I've written a blog post on EKS Architecture and Monitoring in case that's of any help to you. Good luck, and let us know which option you go for!

How to push Mule (Java-based) logs to Prometheus storage?

I have a Mule application, which mostly does HTTP requests, and it logs as plain text. I want to push these logs as metrics to Prometheus. Since this is a legacy application, it would take a substantial amount of time to change the code to push metrics directly into Prometheus storage.
The idea is to show Prometheus metrics in a Grafana dashboard.
Is there any intermediate tool that converts plain text to metrics?
Anything that helps with this requirement would be appreciated.
FYI: we have Nagios and Splunk doing this task as of now; we are looking to move our solution to Prometheus and Grafana.
In situations like these you can use tools like https://github.com/fstab/grok_exporter to convert logs into metrics.
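As a minimal sketch, a grok_exporter config that turns matching log lines into a Prometheus counter could look like the following; the log path, match pattern, and metric name are assumptions you'd adapt to your Mule log format:

    global:
      config_version: 2
    input:
      type: file
      path: /var/log/mule/app.log   # assumed log location
      readall: false                # only tail new lines
    grok:
      patterns_dir: ./patterns
    metrics:
      - type: counter
        name: mule_http_requests_total   # hypothetical metric name
        help: HTTP request log lines, by method and status.
        match: '%{TIMESTAMP_ISO8601} .* %{WORD:method} .* %{NUMBER:status}'
        labels:
          method: '{{.method}}'
          status: '{{.status}}'
    server:
      port: 9144

grok_exporter then serves these counters on /metrics (port 9144 here) for Prometheus to scrape, and Grafana can query Prometheus as usual.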

Getting custom metrics from Kubernetes pods

I was looking into Kubernetes Heapster and metrics-server for getting metrics from the running pods. But the issue is that I need some custom metrics, which might vary from pod to pod, and apparently Heapster only provides CPU- and memory-related metrics. Is there any tool already out there which provides the functionality I want, or do I need to build one from scratch?
What you're looking for is application- and infrastructure-specific metrics. For this, the TICK stack could be helpful! Specifically, Telegraf can be set up to gather detailed infrastructure metrics like memory and CPU pressure, or even the resources used by individual Docker containers, network and IO metrics, etc. But it can also scrape Prometheus metrics from pods. These metrics are then shipped to InfluxDB and visualized using either Chronograf or Grafana.
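For the pod part, the common convention is to annotate pods that expose Prometheus metrics so a scraper can discover them; Telegraf's Prometheus input plugin has a Kubernetes pod-discovery mode that honors these annotations. A minimal sketch, assuming your app serves metrics on port 8080 at /metrics:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app                      # hypothetical pod name
      annotations:
        prometheus.io/scrape: "true"    # opt this pod into scraping
        prometheus.io/port: "8080"      # assumed metrics port
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: my-app
          image: my-app:latest          # hypothetical image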
Not sure if this is still open.
I would classify metrics into 3 types:
Events or logs - system and application events which are sent to logs. These are non-deterministic.
Metrics - CPU and memory utilization on the node the app is hosted on. These are deterministic and are collected periodically.
APM - Application Performance Monitoring metrics; these are application-level metrics like requests received vs. failed vs. responded, etc.
Not all platforms do everything. ELK, for instance, does both metrics and log monitoring but does not do APM. Some of these tools have plugins for collection daemons (such as collectd) which collect performance metrics from the node.
APM is a completely different area, as it requires developer tooling to provide the metrics, as Spring Boot does with Actuator and Node.js does with appmetrics. This carries request-level data. StatsD is an open-source library which applications can use to send APM metrics to StatsD agents installed on the node; an example of the wire format is shown below.
AWS offers CloudWatch agents for log shipping and sink, and X-Ray for distributed tracing, which can be used for APM.
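To make the StatsD option concrete: the wire format is plain text over UDP, one metric per line, so even a legacy app can emit APM metrics with a simple UDP write. The metric names below are hypothetical:

    myapp.requests.received:1|c    # counter increment
    myapp.requests.failed:1|c      # counter increment
    myapp.response_time:320|ms     # timing in milliseconds

A StatsD agent on the node aggregates these lines and forwards them to your metrics backend.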

How to go about logging in GKE without using Stackdriver

We are unable to grab logs from containers running in our GKE cluster if Stackdriver is disabled on GCP. I understand that it is proxying stderr/stdout, but it seems rather heavy-handed to block these outputs when Stackdriver is disabled.
How does one get an ELK stack going on GKE without being billed for Stackdriver, i.e. by disabling it entirely? Or is it so much a part of GKE that this is not doable?
From the article linked on a similar question regarding GCP:
"Kubernetes doesn’t specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: Stackdriver Logging for use with Google Cloud Platform, and Elasticsearch. You can find more information and instructions in the dedicated documents. Both use fluentd with custom configuration as an agent on the node." (https://kubernetes.io/docs/concepts/cluster-administration/logging/#exposing-logs-directly-from-the-application)
Perhaps our understanding of Stackdriver billing is wrong?
But we don't want to be billed for Stackdriver, as the 150MB of logs beyond the GCP metrics is not going to be enough, and we have some expertise in setting up ELK for logging that we'd like to use.
You can disable Stackdriver logging/monitoring on Kubernetes by editing your cluster and setting "Stackdriver Logging" and "Stackdriver Monitoring" to disabled.
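The CLI equivalent for the legacy integration is something like the following; double-check the flags with gcloud container clusters update --help, since this behavior has changed across GKE versions:

    gcloud container clusters update CLUSTER_NAME \
      --logging-service=none \
      --monitoring-service=none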
I would still suggest sticking with GCP over AWS, as you get the whole Kube-as-a-service experience. Amazon's solution is still a little way off, and they are planning to charge for the service in addition to the EC2 node prices (last I heard).