GKE, Stackdriver Kubernetes Engine Monitoring and custom log format

I'm trying to use Stackdriver Kubernetes Engine Monitoring to monitor my GKE cluster (as opposed to Legacy Stackdriver), as recommended by https://cloud.google.com/monitoring/kubernetes-engine/ (despite it being beta).
However, I'd like to use custom log parsers, as some of my containers use proprietary log formats.
My understanding:
- It's a matter of customizing the fluentd ConfigMap (in my case fluentd-gcp-config-v1.2.6 in the kube-system namespace); however, this ConfigMap is managed by GKE and can be replaced at any time.
- With Legacy Stackdriver it was possible to disable logging alone and deploy fluentd-gcp manually, as described e.g. here: https://cloud.google.com/solutions/customizing-stackdriver-logs-fluentd. However, with the new Stackdriver Kubernetes Engine Monitoring it is no longer possible to disable logging (and leave only metrics support).
Is there any other way to support custom log formats?
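For illustration, a custom parser in the fluentd ConfigMap is usually a filter stanza like the sketch below. The container name pattern, field names, and regex are hypothetical placeholders, and the exact syntax depends on the fluentd version the fluentd-gcp image ships:

```
# Hypothetical sketch: parse a proprietary "SEVERITY TIMESTAMP MESSAGE" format
# emitted by containers named myapp-*. Adjust tag pattern and regex to taste.
<filter kubernetes.var.log.containers.myapp-**>
  @type parser
  key_name log
  <parse>
    @type regexp
    expression /^(?<severity>\w+) (?<time>[^ ]+) (?<message>.*)$/
    time_key time
    time_format %Y-%m-%dT%H:%M:%S%z
  </parse>
</filter>
```

The catch, as noted above, is that GKE owns this ConfigMap, so any such edit can be overwritten on upgrade.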

Related

Is it possible/fine to run Prometheus, Loki, Grafana outside of Kubernetes?

In one project, scaling and orchestration are implemented using a local cloud provider's technologies, with no Docker or Kubernetes. But the project has poor logging and monitoring, so I'd like to install Prometheus, Loki, and Grafana for metrics, logs, and visualisation respectively. Unfortunately, I've found no articles with instructions on using Prometheus without K8s.
But is it possible? If so, is it a good approach, and how would I do it? I also know that Prometheus and Loki can automatically detect services in K8s to extract metrics and logs, but will the same work for a custom orchestration system?
Can't comment about Loki, but Prometheus is definitely doable.
Prometheus supports a number of service discovery mechanisms, k8s being just one of them. If you look at the list of options (the ones ending with _sd_config) you can see whether your provider is there.
If it is not, then generic service discovery can be used. Maybe DNS-based discovery will work with your custom system? If not, then with some glue code a file-based service discovery will almost certainly work.
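The glue code for file-based discovery can be as small as periodically rendering your orchestrator's inventory into the JSON format that Prometheus's file_sd_config watches. A minimal Python sketch (the inventory shape and file name are assumptions, not anything your orchestrator actually provides):

```python
import json

def write_file_sd(targets, path="targets.json"):
    """Render (host, port, labels) tuples into Prometheus file_sd format:
    a JSON array of {"targets": [...], "labels": {...}} entries."""
    entries = [
        {"targets": ["%s:%d" % (host, port)], "labels": labels}
        for host, port, labels in targets
    ]
    with open(path, "w") as f:
        json.dump(entries, f, indent=2)

# Example: two instances reported by a hypothetical custom orchestrator
write_file_sd([
    ("10.0.0.5", 9100, {"job": "node"}),
    ("10.0.0.6", 9100, {"job": "node"}),
])
```

Prometheus re-reads the file on change, so running this on a timer (or from your orchestrator's event hooks) keeps the target list current without restarting Prometheus.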
Yes, I'm running Prometheus, Loki, etc. just fine in an AWS ECS cluster. It just requires a bit more configuration, especially around service discovery (if you are not already using something like ECS Service Discovery or HashiCorp Consul).

Is GKE included by default in the Anthos solution? Getting Anthos Metrics

I have a cluster with 7 nodes and a lot of services, nodes, etc. in the Google Cloud Platform. I'm trying to get some metrics with Stackdriver Legacy, so in the Google Cloud Console -> Stackdriver -> Metrics Explorer I have the whole set of Anthos metrics listed, but when I try to create a chart based on those metrics it doesn't show any data; the only response I get in the panel is "no data is available for the selected time frame", even after changing the time frame and so on.
Is it right to think that with Anthos metrics I can retrieve information about my cronjobs, pods and services, such as failed initializations or job failures? And if so, can I do it with Stackdriver Legacy, or do I need to upgrade to Stackdriver Kubernetes Engine Monitoring?
The Anthos solution includes what's called GKE On-Prem. I'd take a look at the instructions for using logging and monitoring on GKE On-Prem. Stackdriver monitors GKE On-Prem clusters in a similar way to cloud-based GKE clusters.
However, there's a note saying that, currently, Stackdriver only collects cluster logs and system component metrics; the full Kubernetes monitoring experience will be available in a future release.
You can also check that you've met all the configuration requirements.

How to Send On Premises Kubernetes Logs to Stackdriver

Objective: get some logging/monitoring on Google's Stackdriver from an on-premises Kubernetes HA cluster, version 1.11.2.
I have been able to send logs to Elasticsearch using the Fluentd DaemonSet for Kubernetes, but the project is not supporting Stackdriver (issue). That said, there is a Docker image created for Stackdriver (source), but it does not have a DaemonSet. Looking at the other DaemonSets in this repository, there are similarities between the different fluent.conf files, with the exception of the Stackdriver fluent.conf file, which is missing any environment variables.
As noted in the GitHub issue mentioned above, there is a plugin located in the Kubernetes GitHub here, but it is legacy. The docs can be found here. They state:
"Warning: The Stackdriver logging daemon has known issues on platforms other than Google Kubernetes Engine. Proceed at your own risk."
Installing in this manner fails, without any indication of why.
Some other notes: there is Stackdriver Kubernetes Monitoring, which clearly states "Easy to get started on any cloud or on-prem" on the front page, but doesn't seem to explain how. This Stack Overflow question has someone looking to add the monitoring to his AWS cluster; it seems that it is not yet supported. Furthermore, the actual Google Stackdriver page also states that it "Works with multiple clouds and on-premises infrastructure".
Of note, I am new to Fluentd and the Google Cloud Platform, but am pretty familiar with administering an on-premises Kubernetes cluster.
Has anyone been able to get monitoring or logging to work on GCP from another platform? If so, what method was used?
Consider reviewing this documentation for using the BindPlane managed fluentd service from Google partner Blue Medora. It is available in Alpha to all Stackdriver users. It parses/forwards Kubernetes logs to Stackdriver, with additional payload markup.
Disclaimer: I am employed by Blue Medora.
Check out the new Stackdriver BindPlane integration, which provides on-premises log capabilities.
It is fully supported by Google and is free (other than typical Stackdriver consumption fees):
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
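If you would rather roll your own than use BindPlane, the usual approach is to point a fluentd DaemonSet's output at Stackdriver via the fluent-plugin-google-cloud output plugin, authenticating with a service account key mounted into the pod. A rough sketch of the match stanza (the project ID is a placeholder, and exact parameters depend on the plugin version):

```
# Sketch only: ship all matched records to Stackdriver Logging.
<match **>
  @type google_cloud
  # Off GCP there is no metadata server; rely on a mounted service
  # account key via GOOGLE_APPLICATION_CREDENTIALS instead.
  use_metadata_service false
  project_id my-gcp-project   # hypothetical project ID
</match>
```

Given the "known issues on platforms other than Google Kubernetes Engine" warning quoted above, treat this as experimental on-prem.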

Get request count from Kubernetes service

Is there any way to get statistics such as service / endpoint access for services defined in Kubernetes cluster?
I've read about Heapster, but it doesn't seem to provide these statistics. Plus, the whole setup is tremendously complicated and relies on a ton of third-party components. I'd really like something much, much simpler than that.
I've been looking into what may be available in the kube-system namespace, and there are a bunch of containers and services there, Heapster included, but they are effectively inaccessible because they require authentication I cannot provide, and kubectl doesn't seem to have any API to access them (or does it?).
Heapster is the agent that collects the data, but then you need a monitoring agent to interpret it. On GCP, for example, it's fluentd that picks up these metrics and sends them to Stackdriver.
Prometheus is an excellent monitoring tool. I would recommend it if you are not on GCP.
If you are on GCP, then as mentioned above you have Stackdriver Monitoring, which is configured by default for K8s clusters. All you have to do is create a Stackdriver account (this is done with one click from the GCP Console), and you are good to go.
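If you go the Prometheus route, note that it scrapes metrics your services expose themselves (e.g. a request counter endpoint); it then discovers pods via Kubernetes service discovery. A typical scrape job looks roughly like the fragment below (the prometheus.io/scrape annotation is a widely used convention, not a Kubernetes standard):

```
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only scrape pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```

This gives per-endpoint request counts only if the application itself exports them, which is the main difference from a transparent proxy-level solution.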

How to go about logging in GKE without using Stackdriver

We are unable to grab logs from containers running in our GKE cluster if Stackdriver is disabled on GCP. I understand that it is proxying stderr/stdout, but it seems rather heavy-handed to block these outputs when Stackdriver is disabled.
How does one get an ELK stack going on GKE without being billed for Stackdriver, i.e. disabling it entirely? Or is it so much a part of GKE that this is not doable?
From the article linked on a similar question regarding GCP:
"Kubernetes doesn’t specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: Stackdriver Logging for use with Google Cloud Platform, and Elasticsearch. You can find more information and instructions in the dedicated documents. Both use fluentd with custom configuration as an agent on the node." (https://kubernetes.io/docs/concepts/cluster-administration/logging/#exposing-logs-directly-from-the-application)
Perhaps our understanding of Stackdriver billing is wrong? But we don't want to be billed for Stackdriver, as the 150MB of logs outside of the GCP metrics is not going to be enough, and we have some expertise in setting up ELK for logging that we'd like to use.
You can disable Stackdriver logging/monitoring on Kubernetes by editing your cluster and setting "Stackdriver Logging" and "Stackdriver Monitoring" to Disabled.
I would still suggest sticking with GCP over AWS, as you get the whole Kube-as-a-service experience. Amazon's solution is still a little way off, and they are planning to charge for the service in addition to the EC2 node prices (last I heard).
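The same thing can be done from the command line with the legacy flags; this is a sketch with a placeholder cluster name and zone, so check `gcloud container clusters update --help` for your SDK version:

```
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --logging-service none \
  --monitoring-service none
```

With logging set to none, GKE stops deploying the Stackdriver fluentd agents, and you are free to run your own log collector DaemonSet.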