I have a cluster with 7 nodes and a lot of services, nodes, etc in the Google Cloud Platform. I'm trying to get some metrics with StackDriver Legacy, so in the Google Cloud Console -> StackDriver -> Metrics Explorer I have all the set of anthos metrics listed but when I try to create a chart based on that metrics it doesn't show the data, actually the only response that I get in the panel is no data is available for the selected time frame even changing the time frame and stuffs.
Is right to think that with anthos metrics I can retrieve information about my cronjobs, pods, services like failed initializations, jobs failures ? And if so, I can do it with StackDriver Legacy or I need to Update to StackDriver kubernetes Engine Monitoring ?
Anthos solution, includes what’s called GKE-on prem. I’d take a look at the instructions to use logging and monitoring on GKE-on prem. Stackdriver monitors GKE On-Prem clusters in a similar way as cloud-based GKE clusters.
However, there’s a note where they say that currently, Stackdriver only collects cluster logs and system component metrics. The full Kubernetes Monitoring experience will be available in a future release.
You can also check that you’ve met all the configuration requirements.
Related
Objective: Get some logging/monitoring on Googles
Stackdriver from a Kuberntes HA cluster
that is on premises, version 1.11.2.
I have been able to send logs to Elasticsearch using Fluentd Daemonset for
Kubernetes, but the
project is not supporting Stackdriver
(issue).
That said, there is a docker image created for Stackdriver
(source),
but it does not have the daemonset. Looking at other daemonsets in this
repository, there are similarities between the different fluent.conf files
with the exception of the Stackdriver fluent.conf file that is missing any
environment variables.
As noted in the GitHub
issue
mentioned above there is a plugin located in the Kubernetes GitHub
here,
but it is legacy.
The docs can be found
here.
It states:
"Warning: The Stackdriver logging daemon has known issues on
platforms other than Google Kubernetes Engine. Proceed at your own risk."
Installing in this manner fails, without indication of why.
Some other notes. There is Stackdriver Kubernetes
Monitoring that clearly
states:
"Easy to get started on any cloud or on-prem"
on the front page, but
doesn't seem to explain how. This Stack Overflow
question
has someone looking to add the monitoring to his AWS cluster. It seems that it is not yet supported.
Furthermore, on the actual Google
Stackdriver it is also stated that
"Works with multiple clouds and on-premises infrastructure".
Of note, I am new to Fluentd and the Google Cloud Platform, but am pretty
familiar with administering an on-premise Kubernetes cluster.
Has anyone been able to get monitoring or logging to work on GCP from another platform? If so, what method was used?
Consider reviewing this documentation for using the BindPlane managed fluentd service from Google partner Blue Medora. It is available in Alpha to all Stackdriver users. It parses/forwards Kubernetes logs to Stackdriver, with additional payload markup.
Disclaimer: I am employed by Blue Medora.
Check out the new Stackdriver BindPlane integration which provides on-premise log capabilities.
It is fully supported by Google and is free (other than typical Stackdriver consumption fees)
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
We are unable to grab logs from our GKE cluster running containers if StackDriver is disabled on GCP. I understand that it is proxying stderr/stdout but it seems rather heavy handed to block these outputs when Stackdriver is disabled.
How does one get an ELF stack going on GKE without being billed for StackDriver aka disabling it entirely? or is it so much a part of GKE that this is not doable?
From the article linked on a similar question regarding GCP:
"Kubernetes doesn’t specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: Stackdriver Logging for use with Google Cloud Platform, and Elasticsearch. You can find more information and instructions in the dedicated documents. Both use fluentd with custom configuration as an agent on the node." (https://kubernetes.io/docs/concepts/cluster-administration/logging/#exposing-logs-directly-from-the-application)
Perhaps our understanding of Stackdriver billing is wrong?
But we don't want to be billed for Stackdriver as the 150MB of logs outside of the GCP metrics is not going to be enough and we have some expertise in setting up ELF for logging that we'd like to use.
You can disable Stackdriver logging/monitoring on Kubernetes by editing your cluster, and setting "Stackdriver Logging" and "Stackdriver Monitoring" to disable.
I would still suggest sticking to GCP over AWS as you get the whole Kube as a service experience. Amazon's solution is still a little way off, and they are planning charging for the service in addition to the EC2 node prices (Last I heard).
We have some Kubernetes clusters that have been deployed using kops in AWS.
We really like using the upstream/official images.
We have been wondering whether or not there was a good way to monitor the systems without installing software directly on the hosts? Are there docker containers that can extract the information from the host? I think that we are likely concerned with:
Disk space (this seems to be passed through to docker via df
Host CPU utilization
Host memory utilization
Is this host/node level information already available through heapster?
Not really a question about kops, but a question about operating Kubernetes. kops stops at the point of having a functional k8s cluster. You have networking, DNS, and nodes have joined the cluster. From there your world is your oyster.
There are many different options for monitoring with k8s. If you are a small team I usually recommend offloading monitoring and logging to a provider.
If you are a larger team or have more specific needs then you can look at such options as Prometheus and others. Poke around in the https://github.com/kubernetes/charts repository, as I know there is a Prometheus chart there.
As with any deployment of any form of infrastructure you are going to need Logging, Monitoring, and Metrics. Also, do not forget to monitor the monitoring ;)
I am using https://prometheus.io/, it goes naturally with kubernetes.
Kubernetes api already exposes a bunch of metrics in prometheus format,
https://github.com/kubernetes/ingress-nginx also exposes prometheus metrics (enable-vts-status: "true"), and you can also install https://github.com/prometheus/node_exporter as a daemonset to monitor CPU, disk, etc...
I install one prometheus inside the cluster to monitor internal metrics and one outside the cluster to monitor LBs and URLs.
Both send alerts to the same https://github.com/prometheus/alertmanager that MUST be outside the cluster.
It took me about a week to configure everything properly.
It was worth it.
I have an app running on Google Container Engine.
I would like to monitor number requests per second my api is receiving
how can I do this?
is there a way monitoring from historical metrics on Stackdriver
as I am opted for Stackdriver Premium
Looking at https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/ I see that Stackdriver is deployed in the cluster by default. It looks like that, according to https://cloud.google.com/monitoring/api/metrics, this includes the metrics you are looking for.
I'm running Kubernetes myself on Google Compute Engine (not Google Container Engine). Google Container Engine has built-in integration with Stackdriver Monitoring and I'm wondering if it's possible to set this up for a Kubernetes cluster on Google Compute Engine.
Specifically, I'd like to see more than just cpu, disk, etc. I want to see Kubernetes data like pod scheduling failures, pod counts, etc.
It is not possible to configure Stackdriver exactly the same way that it is done in GKE.
However, you can set ENABLE_CLUSTER_MONITORING to google in config-default.sh to enable Heapster and Google Cloud Monitoring.