How to go about logging in GKE without using Stackdriver - kubernetes

We are unable to grab logs from containers in our GKE cluster when Stackdriver is disabled on GCP. I understand that it proxies stderr/stdout, but it seems rather heavy-handed to block these outputs when Stackdriver is disabled.
How does one get an ELK stack going on GKE without being billed for Stackdriver, i.e. with Stackdriver disabled entirely? Or is it so much a part of GKE that this is not doable?
From the article linked on a similar question regarding GCP:
"Kubernetes doesn’t specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: Stackdriver Logging for use with Google Cloud Platform, and Elasticsearch. You can find more information and instructions in the dedicated documents. Both use fluentd with custom configuration as an agent on the node." (https://kubernetes.io/docs/concepts/cluster-administration/logging/#exposing-logs-directly-from-the-application)
Perhaps our understanding of Stackdriver billing is wrong?
But we don't want to be billed for Stackdriver: the 150 MB of logs outside of the GCP metrics is not going to be enough, and we have some expertise in setting up ELK for logging that we'd like to use.

You can disable Stackdriver logging/monitoring on Kubernetes by editing your cluster and setting "Stackdriver Logging" and "Stackdriver Monitoring" to Disabled.
I would still suggest sticking with GCP over AWS, as you get the whole Kubernetes-as-a-service experience. Amazon's solution is still a little way off, and (last I heard) they are planning to charge for the service in addition to the EC2 node prices.
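The same toggle is available from the CLI. A sketch, assuming a cluster named `my-cluster` in `us-central1-a` (both names are placeholders); with both services set to `none` you are free to run your own fluentd DaemonSet shipping to Elasticsearch:

```
# Disable the legacy Stackdriver logging and monitoring integrations
# on an existing GKE cluster (cluster name and zone are placeholders).
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --logging-service none \
  --monitoring-service none
```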

Related

How can I check if a resource inside Kubernetes has been deleted for some reason?

I am a junior developer currently running a service in a Kubernetes environment.
How can I check if a resource inside Kubernetes has been deleted for some reason?
As a simple example, if a deployment is deleted, I want to know which user deleted it.
Could you please tell me which log to look at? I would also like to know how to collect these logs.
I don't have much experience yet, so I'm asking for help.
Also, if you have a reference or link, please share it. It will be very helpful to me.
Thank you:)
Start by enabling audit logging; there are lots of online resources about doing this.
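To answer the "which user deleted it" part: the API server's audit log records the requesting user for every delete. A minimal audit-policy sketch for a self-managed cluster (file paths and flag wiring are assumptions about your setup; on managed offerings the provider configures this for you):

```yaml
# audit-policy.yaml - passed to kube-apiserver via flags such as:
#   --audit-policy-file=/etc/kubernetes/audit-policy.yaml
#   --audit-log-path=/var/log/kubernetes/audit.log
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record metadata (user, verb, resource, timestamp) for deletes of
  # deployments; "user.username" in each event tells you who did it.
  - level: Metadata
    verbs: ["delete", "deletecollection"]
    resources:
      - group: "apps"
        resources: ["deployments"]
  # Ignore everything else to keep the log small.
  - level: None
```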
If you are on AWS and using EKS, I would suggest enabling "Amazon EKS control plane logging". This streams audit and diagnostic logs to Amazon CloudWatch Logs, where they are more easily accessible and useful for audit and compliance requirements. Control plane logs make it easier to secure and run your clusters and make the entire system more auditable.
As per AWS documentation:
Kubernetes API server component logs (api) – Your cluster's API server is the control plane component that exposes the Kubernetes API. For more information, see kube-apiserver in the Kubernetes documentation.
Audit (audit) – Kubernetes audit logs provide a record of the individual users, administrators, or system components that have affected your cluster. For more information, see Auditing in the Kubernetes documentation.
Authenticator (authenticator) – Authenticator logs are unique to Amazon EKS. These logs represent the control plane component that Amazon EKS uses for Kubernetes Role-Based Access Control (RBAC) authentication using IAM credentials. For more information, see Cluster authentication.
Controller manager (controllerManager) – The controller manager manages the core control loops that are shipped with Kubernetes. For more information, see kube-controller-manager in the Kubernetes documentation.
Scheduler (scheduler) – The scheduler component manages when and where to run pods in your cluster. For more information, see kube-scheduler in the Kubernetes documentation.
Reference: https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html
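If you manage the cluster with eksctl, the log types above can be enabled declaratively. A sketch (cluster name and region are placeholders), applied with something like `eksctl utils update-cluster-logging --config-file cluster.yaml --approve`:

```yaml
# cluster.yaml - eksctl cluster config fragment
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster   # placeholder
  region: us-east-1  # placeholder
cloudWatch:
  clusterLogging:
    # any subset of: "api", "audit", "authenticator",
    # "controllerManager", "scheduler"
    enableTypes: ["api", "audit", "authenticator"]
```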

Is GKE built into the Anthos solution by default? Getting Anthos metrics

I have a cluster with 7 nodes and a lot of services, pods, etc. on the Google Cloud Platform. I'm trying to get some metrics with Stackdriver Legacy, so in the Google Cloud Console -> Stackdriver -> Metrics Explorer I have the whole set of Anthos metrics listed, but when I try to create a chart based on those metrics it doesn't show any data. The only response I get in the panel is "no data is available for the selected time frame", even after changing the time frame.
Is it right to think that with Anthos metrics I can retrieve information about my cronjobs, pods, and services, such as failed initializations and job failures? And if so, can I do it with Stackdriver Legacy, or do I need to upgrade to Stackdriver Kubernetes Engine Monitoring?
The Anthos solution includes what's called GKE On-Prem. I'd take a look at the instructions for using logging and monitoring on GKE On-Prem; Stackdriver monitors GKE On-Prem clusters in a similar way to cloud-based GKE clusters.
However, there's a note saying that currently Stackdriver only collects cluster logs and system component metrics; the full Kubernetes monitoring experience will be available in a future release.
You can also check that you’ve met all the configuration requirements.

How to Send On Premises Kubernetes Logs to Stackdriver

Objective: Get some logging/monitoring on Google's Stackdriver from an on-premises Kubernetes HA cluster, version 1.11.2.
I have been able to send logs to Elasticsearch using the Fluentd DaemonSet for Kubernetes, but the project is not supporting Stackdriver (issue). That said, there is a Docker image created for Stackdriver (source), but it does not have a daemonset. Looking at the other daemonsets in this repository, there are similarities between the different fluent.conf files, with the exception of the Stackdriver fluent.conf file, which is missing any environment variables.
As noted in the GitHub issue mentioned above, there is a plugin located in the Kubernetes GitHub repository, but it is legacy. Its docs state: "Warning: The Stackdriver logging daemon has known issues on platforms other than Google Kubernetes Engine. Proceed at your own risk." Installing in this manner fails, without any indication of why.
Some other notes: there is Stackdriver Kubernetes Monitoring, which clearly states "Easy to get started on any cloud or on-prem" on its front page, but doesn't seem to explain how. A related Stack Overflow question has someone looking to add the monitoring to his AWS cluster; it seems that it is not yet supported. Furthermore, the actual Google Stackdriver page also states "Works with multiple clouds and on-premises infrastructure".
Of note, I am new to Fluentd and the Google Cloud Platform, but am pretty familiar with administering an on-premises Kubernetes cluster.
Has anyone been able to get monitoring or logging to work on GCP from another platform? If so, what method was used?
Consider reviewing this documentation for using the BindPlane managed fluentd service from Google partner Blue Medora. It is available in Alpha to all Stackdriver users. It parses/forwards Kubernetes logs to Stackdriver, with additional payload markup.
Disclaimer: I am employed by Blue Medora.
Check out the new Stackdriver BindPlane integration which provides on-premise log capabilities.
It is fully supported by Google and is free (other than typical Stackdriver consumption fees)
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
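Going back to the fluentd route the question mentions: the legacy output plugin (fluent-plugin-google-cloud) is configured with a match block roughly like the sketch below. This is an assumption-laden illustration only; as the warning quoted above says, the daemon is unsupported off-GKE, and outside GCP the plugin cannot use the metadata server, so project and service-account credentials must be supplied explicitly on each node (paths and project ID here are placeholders):

```
<match **>
  @type google_cloud   # from fluent-plugin-google-cloud
  # Credentials are typically provided via the environment, e.g.
  # GOOGLE_APPLICATION_CREDENTIALS pointing at a service-account key file.
  project_id my-gcp-project   # placeholder
  buffer_type file
  buffer_path /var/log/fluentd-buffers/stackdriver.buffer
</match>
```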

Get request count from Kubernetes service

Is there any way to get statistics such as service / endpoint access for services defined in Kubernetes cluster?
I've read about Heapster, but it doesn't seem to provide these statistics. Plus, the whole setup is tremendously complicated and relies on a ton of third-party components; I'd really like something much, much simpler than that.
I've been looking into what's available in the kube-system namespace. There's a bunch of containers and services there, Heapster included, but they are effectively inaccessible because they require authentication I cannot provide, and kubectl doesn't seem to have any API to access them (or does it?).
Heapster is the agent that collects the data, but you then need a monitoring agent to interpret it. On GCP, for example, it's fluentd that picks up these metrics and sends them to Stackdriver.
Prometheus is an excellent monitoring tool; I would recommend it if you are not on GCP.
If you are on GCP, then as mentioned above you have Stackdriver Monitoring, which is configured by default for Kubernetes clusters. All you have to do is create a Stackdriver account (one click from the GCP Console) and you are good to go.

How do you monitor kubernetes nodes deployed using kops?

We have some Kubernetes clusters that have been deployed using kops in AWS.
We really like using the upstream/official images.
We have been wondering whether or not there was a good way to monitor the systems without installing software directly on the hosts? Are there docker containers that can extract the information from the host? I think that we are likely concerned with:
Disk space (this seems to be passed through to docker via df)
Host CPU utilization
Host memory utilization
Is this host/node level information already available through heapster?
Not really a question about kops, but rather about operating Kubernetes. kops stops at the point of having a functional k8s cluster: you have networking, DNS, and nodes that have joined the cluster. From there, your world is your oyster.
There are many different options for monitoring with k8s. If you are a small team I usually recommend offloading monitoring and logging to a provider.
If you are a larger team or have more specific needs then you can look at such options as Prometheus and others. Poke around in the https://github.com/kubernetes/charts repository, as I know there is a Prometheus chart there.
As with any deployment of any form of infrastructure you are going to need Logging, Monitoring, and Metrics. Also, do not forget to monitor the monitoring ;)
I am using https://prometheus.io/; it goes naturally with Kubernetes.
The Kubernetes API already exposes a bunch of metrics in Prometheus format, https://github.com/kubernetes/ingress-nginx also exposes Prometheus metrics (enable-vts-status: "true"), and you can install https://github.com/prometheus/node_exporter as a DaemonSet to monitor CPU, disk, etc.
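As a concrete example, the in-cluster Prometheus can discover the node_exporter pods through Kubernetes service discovery. A minimal scrape-config sketch; the job name and the `node-exporter` service name are assumptions about how it was deployed:

```yaml
# prometheus.yml fragment - scrape node_exporter via Kubernetes SD.
scrape_configs:
  - job_name: "node-exporter"
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Keep only endpoints belonging to the node-exporter service
      # (service name is an assumption).
      - source_labels: [__meta_kubernetes_service_name]
        regex: node-exporter
        action: keep
```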
I install one prometheus inside the cluster to monitor internal metrics and one outside the cluster to monitor LBs and URLs.
Both send alerts to the same https://github.com/prometheus/alertmanager that MUST be outside the cluster.
It took me about a week to configure everything properly.
It was worth it.