Grafana alert execution exceeded timeout, why is this happening?

I am using Grafana to set up email alerts. I have created all the panels on my dashboard and just turned the alerts on. However, I am now getting the following error: "Alert execution exceeded the timeout". This is sending emails for all the servers on that dashboard to everyone associated with the email alert. Why is this happening? Are there too many servers on one data source? Should I split the single data source into multiple ones?

The question is old, but people still come here looking for an answer.
It happens when the data source used to generate the charts responds very slowly (roughly more than 30 s). In that case Grafana throws the execution timeout error.
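
One way to check whether a slow data source is the cause is to time the panel's query outside Grafana and compare it with the limit. The following is only a minimal sketch: `run_query` is a hypothetical stand-in for whatever client call fetches your data, and the 30-second figure is taken from the answer above, not from your instance's configuration.

```python
import time
from typing import Callable, Tuple

# Assumed limit, based on the ~30 s figure quoted in the answer above;
# your Grafana instance may be configured differently.
ALERT_TIMEOUT_S = 30.0


def time_query(run_query: Callable[[], object],
               limit_s: float = ALERT_TIMEOUT_S) -> Tuple[float, bool]:
    """Run the query once and return (elapsed seconds, exceeded limit?)."""
    start = time.monotonic()
    run_query()  # hypothetical: replace with your data source client call
    elapsed = time.monotonic() - start
    return elapsed, elapsed > limit_s


if __name__ == "__main__":
    # Stand-in query that just sleeps briefly.
    elapsed, too_slow = time_query(lambda: time.sleep(0.1))
    print(f"query took {elapsed:.2f}s, exceeds limit: {too_slow}")
```

If the measured time is close to or above the limit, the query itself (time range, number of series, data source load) is the place to look, rather than the alert rule.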

Is there a way to prevent the same alert from firing repeatedly in Azure Application Insights if the conditions haven't changed?

We are using Azure Application Insights for a project, and I want to stop App Insights from detecting the same problem multiple times in a row, if it's the same piece of data causing the same issue repeatedly:
For example, when posting a data item to a third-party API, the data is in an incorrect format, so the API rejects it and an alert is flagged. The item returns to the top of the queue and is attempted again an hour later, resulting in continuous alerts unless the system is shut down.
I don't want to have to answer an alert at every hour of the day. Whilst I understand that a code solution would be possible, I am hoping there is a simpler, more blanket fix that can be made in Application Insights to prevent this from occurring.

Grafana message templates

I am currently setting up Grafana alerts. How do I customize my message template so that my alert email shows the IP address of the server, the state of the server, and the node/instance?
Thank you.
I figured it out once, then recently I updated my Grafana instance, which wiped my work, and I had to figure it out again. It was tough the first time.
You can use the labels that are made available through Prometheus in the summary and description sections of your alerts by using the syntax:
{{$labels.instance}}
{{$labels.value}}
https://prometheus.io/docs/prometheus/latest/configuration/template_examples/
The only catch is that you have to use a Math expression as the last condition in your alert rule for the labels to be available in the Summary section of the alert.
For example, in our personal alerts we will use something like:
Machine {{$labels.instance}} is not reporting status via win-exporter.
The machine could be offline or the service could be stopped.

Tableau Google Ads Connector

I have a question about connecting a Google Ads source to Tableau Desktop. I saved the source and set it as an auto-refreshing extract, but from a certain point on I can no longer add new reports because of too many connections to the source. The error message is below. Has anyone had this problem and managed to resolve it?
Error message: Unknown Failure (status code = 5054, Tableau encountered an error while communicating with Google Ads: RateExceededError.RATE_EXCEEDED. Your Ads rate limit was exceeded due to too much access in a short period of time or too much access within the last day. Try again later.

Why does test suite for smart home run out of time

With regard to the test suite, I've entered the userAgentId and the JSON key correctly and it progresses fine. The problem arises when I start the test. Each utterance is read out to my Google Home. The Google Home wakes up to "Ok Google. Turn off the colorful light". The colorful light turns off and reports the close status to the home graph. I open the reportstate dashboard to quickly confirm that the status has been modified. I then waited a long time and finally got a timeout error. I don't know why this happens or what else I need to do. I have two screenshots here: one is the request timeout and the other is the reportstate dashboard.
[Screenshot: timeout error]
[Screenshot: reportstate dashboard of my colorful light]
Timeouts with the test suite can be due to the following reasons:
- The execution or query responses might be failing, or taking too much time.
- The Report State implementation might be faulty.
The report state dashboard is an old tool that is no longer maintained, so the changes you see there might not accurately represent what's written to the home graph. Try using the HomeGraph Viewer instead.

Why does my Github webhook keep timing out?

We couldn’t deliver this payload: Service Timeout
I was successfully sending webhooks to my server 5 minutes ago, and now I just keep getting timeouts. I tried deleting the webhook and re-adding it, and changing the URL it points to, but nothing helped.
Am I flooding it with too many pushes, or is GitHub's webhook service just down?
It also turns out that GitHub has a 10-second timeout set on their webhooks. That is what I ran into. See the documentation here.
Unless there is some kind of error on the GitHub side (which doesn't seem to be the case at the moment, given their "System Status" history), you might check the program receiving the payload of that webhook.
See a similar problem in Supybot-plugins 225:
I contacted GitHub support and one of the employees has been troubleshooting this for me. Here is part of what he had to say about the issue:
I just tried making a request manually from one of our machines, and that went through with no error (see curl -v output below).
However, I did notice that it took extremely long for the request to be processed -- over 15 seconds (for 2 bytes of data).
Decoupling the listening for and reception of the payload from its processing is generally the right approach, as I recommended in "Perl Script slow over Tomcat 6.0 and generates service time out".
The first part should be as fast as possible.
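
The decoupling described above can be sketched roughly as follows, using only the Python standard library. This is an illustration rather than a production receiver: `process` is a hypothetical placeholder for the slow work, and the port, the 202 status, and the in-memory queue are assumptions made for the example (no signature verification or persistence is shown).

```python
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Payloads wait here until the background worker picks them up.
jobs: "queue.Queue[bytes]" = queue.Queue()


def process(payload: bytes) -> None:
    """Hypothetical slow step (build, deploy, notify, ...)."""
    print(f"processing {len(payload)} bytes")


def worker() -> None:
    """Drain the queue in the background, one payload at a time."""
    while True:
        payload = jobs.get()
        try:
            process(payload)
        finally:
            jobs.task_done()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        jobs.put(self.rfile.read(length))  # enqueue only, no real work here
        self.send_response(202)  # acknowledge right away
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start the worker thread, then block serving webhook requests."""
    threading.Thread(target=worker, daemon=True).start()
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver. Because the handler only reads the body and enqueues it, the response time no longer depends on how long `process` takes, which is what keeps the delivery within the sender's timeout.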