Query Stackdriver Uptime Checks - gcloud

I am trying to query for Stackdriver Uptime Checks using the Google Monitoring API, but I cannot find anything in the documentation that illustrates how to query for the uptime checks that were set up in Stackdriver. You will note that the queryable metrics include agent.googleapis.com/agent/uptime, but this does not return the uptime checks seen under Stackdriver Uptime Checks. Below I am listing some of the documentation I have been sifting through in case it is helpful.
Does anyone know how/if this can be done?
Google Python Client Docs
Time Series Query
Metrics

I'm a product manager on the Stackdriver team. Unfortunately, Uptime Check metrics are not currently available via the Stackdriver Metrics API. This is a feature we're actively working to provide. I'll follow up on this thread when the feature is released.
Thank you for your question and for using Stackdriver!

It's my understanding that this metric can now be externally queried as:
monitoring.googleapis.com/uptime_check/check_passed
You can see it referenced in the sample alerting policy JSON for creating uptime check alerting policies.
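If you want to pull it programmatically, here is a minimal sketch using the google-cloud-monitoring Python client (the project ID is a placeholder, and I'm assuming the current v3 client library; treat this as a starting point rather than a definitive implementation):

import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project-id"  # placeholder: your GCP project ID

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "end_time": {"seconds": now},
        "start_time": {"seconds": now - 3600},  # look back one hour
    }
)

# Pull the uptime-check results for the last hour.
results = client.list_time_series(
    request={
        "name": project_name,
        "filter": 'metric.type = "monitoring.googleapis.com/uptime_check/check_passed"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    # check_id identifies the uptime check; each point is a pass/fail boolean.
    print(series.metric.labels["check_id"], [p.value.bool_value for p in series.points])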

Related

Cloud SQL (postgres) - Fetching metrics tagged with BETA

I have been trying to fetch metrics for my Cloud SQL (Postgres) instance to get insights into query performance, but I'm unable to find a way to fetch metrics that are in the BETA and ALPHA stages.
For example, the metric
database/postgresql/insights/perquery/execution_time is listed on the Google Cloud metrics page but does not show up in the Metrics Explorer.
I have tried fetching the metrics using the Java SDK, which seems to accept/recognise the request and the metric name but does not return any time-series data.
I'm curious to know whether BETA/ALPHA metrics need additional configuration to be enabled.
The SQL metrics became available in the Metrics Explorer and the SDK after enabling Query Insights in the Google Cloud console.
Although this looks obvious, it would be good to have a note mentioning this on the Google Cloud metrics page.
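Once Query Insights is enabled, you can also confirm the descriptors are visible to your project via the Monitoring API; a minimal sketch in Python (the project ID is a placeholder):

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project-id"  # placeholder: your GCP project ID

# List the Cloud SQL insights descriptors; per the answer above, these
# only appear after Query Insights is enabled on the instance.
descriptors = client.list_metric_descriptors(
    request={
        "name": project_name,
        "filter": 'metric.type = starts_with("cloudsql.googleapis.com/database/postgresql/insights")',
    }
)
for descriptor in descriptors:
    print(descriptor.type, descriptor.launch_stage)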

REST API for getting performance metrics for an HDInsight cluster?

I am looking for a REST API that will allow me to fetch performance metrics for a given HDInsight (Hadoop/Linux) cluster, such as the amount or percentage of memory used by the cluster, CPU usage, etc. I haven't come across anything specific. The closest link I have found is this, but it doesn't have any reference to getting performance metrics. Is this info even exposed as a REST API?
According to my understanding, you want to get the metrics of the cluster. If so, you can use the following REST API to get them. For more details, please refer to the document and article.
Method: GET
URL: https://<clustername>.azurehdinsight.net/api/v1/clusters/<clustername>?fields=metrics/load[<start time>,<end time>,<step>]
Headers : Authorization: Basic <username password>
For example:
Get CPU usage
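A sketch of that call in Python (the cluster name and credentials are placeholders, and I'm assuming the same temporal query syntax applies to metrics/cpu as to metrics/load):

import time
import requests

cluster = "mycluster"  # placeholder: your HDInsight cluster name
end = int(time.time())
start = end - 3600     # last hour
step = 60              # one data point per minute

url = (
    f"https://{cluster}.azurehdinsight.net/api/v1/clusters/{cluster}"
    f"?fields=metrics/cpu[{start},{end},{step}]"
)
# Basic auth with the cluster login credentials.
resp = requests.get(url, auth=("admin", "cluster-login-password"))
resp.raise_for_status()
print(resp.json().get("metrics", {}))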

How can I monitor my pods running on Kubernetes?

Is it possible to monitor pods or to get mail alerts when a pod is down? How do I set up the alert?
Yes, it is possible; you have to set up Prometheus with Alertmanager.
I recommend using prometheus-operator as an easier way to start with monitoring.
It depends on whether you want to use open-source apps or paid software for monitoring and alerting.
As @FL3SH advised, the most widely used software for monitoring and sending alerts is Prometheus with Alertmanager. This solution has many "how to" tutorials online, for example this one.
However, there are many paid tools to monitor your cluster/pods, alert you, create history diagrams, etc. (like Datadog, Sysdig, Dynatrace), as well as mixed solutions (like Prometheus and Grafana, cAdvisor, Kibana, etc.). For more information you can check this article.
Please note that each cloud provider offers some specific monitoring features.
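If you just want a quick script-based check before standing up a full Prometheus stack, here is a minimal sketch using the official kubernetes Python client (it assumes you have kubectl access; it only prints offenders, but you could bolt an email step onto it):

from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running inside the cluster
v1 = client.CoreV1Api()

# Flag any pod that is neither running nor completed.
for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    if pod.status.phase not in ("Running", "Succeeded"):
        print(f"{pod.metadata.namespace}/{pod.metadata.name} is {pod.status.phase}")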

Stackdriver Job Monitoring - BigQuery or Dataflow

How can we check slow job performance and job recovery through Stackdriver? I am looking at Dataflow or BigQuery jobs.
You can always go to the public documentation page for GCP here to ask general questions about GCP.
Regarding your inquiry, I have attached an article on how to monitor your Dataflow pipelines using Stackdriver Monitoring here.
You can also follow this article on how to use Stackdriver Monitoring to monitor your jobs within BigQuery.
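If you would rather query the job metrics directly, the Monitoring API works here too; a minimal Python sketch for Dataflow (the project ID is a placeholder, and I'm assuming the dataflow.googleapis.com/job/elapsed_time metric as one way to spot slow jobs):

import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project-id"  # placeholder: your GCP project ID
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
)

# How long each Dataflow job has been running; a batch job whose elapsed
# time keeps growing well past its usual runtime is a slow-job signal.
results = client.list_time_series(
    request={
        "name": project_name,
        "filter": (
            'metric.type = "dataflow.googleapis.com/job/elapsed_time" '
            'AND resource.type = "dataflow_job"'
        ),
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    print(series.resource.labels["job_name"], series.points[0].value.int64_value)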

Flume Metrics through REST API

I'm running Hortonworks 2.3 and currently hooking into the REST API through Ambari to start/stop the Flume service and also submit configurations.
This is all working fine; my issue is how to get the metrics.
Previously I would run an agent with parameters that exposed the metrics on an HTTP port and then read them from there using this:
-Dflume.root.logger=INFO,console
-Dflume.monitoring.type=http
-Dflume.monitoring.port=XXXXX
However, now that Ambari kicks off the agent, I no longer have control over this.
Any assistance appreciated :-)
Using Ambari 2.6.2.0,
http://{ipadress}:8080/api/v1/clusters/{your_cluster_name}/components/?ServiceComponentInfo/component_name=FLUME_HANDLER&fields=host_components/metrics/flume/flume
gives a Flume metrics breakdown by component.
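In Python that call could look like this (the Ambari host, credentials, and the exact response shape are assumptions on my side):

import requests

ambari = "http://10.0.0.1:8080"  # placeholder: your Ambari host and port
cluster = "mycluster"            # placeholder: your cluster name

url = (
    f"{ambari}/api/v1/clusters/{cluster}/components/"
    "?ServiceComponentInfo/component_name=FLUME_HANDLER"
    "&fields=host_components/metrics/flume/flume"
)
resp = requests.get(url, auth=("admin", "admin"))  # Ambari credentials
resp.raise_for_status()
# Walk the per-host Flume metrics out of the response.
for item in resp.json().get("items", []):
    for hc in item.get("host_components", []):
        print(hc.get("metrics", {}).get("flume", {}))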
I found the answer by experimenting with (and trimming down) the API call attached to this JIRA issue (which complains about how slow fetching Flume metrics is): https://issues.apache.org/jira/browse/AMBARI-9914?attachmentOrder=asc
Hope this helps.
I don't know if you still need the answer. That happens because Hortonworks, by default, disables JSON monitoring; they use their own metrics class to send the metrics to Ambari Metrics. While you can't retrieve them from Flume directly, you can still retrieve them from the Ambari REST API: https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md.
Good luck,