I have deployed a web service on AWS EC2 instances.
I have also implemented a REST call, /getStatus, which returns the status of the modules in my service in JSON format, e.g. the DB connection status, ActiveMQ cache status, etc.
I want a way to create an automatic email trigger which will send mail when any issue is found in the response of the /getStatus REST call.
I am looking at whether it's possible using CloudWatch, but any other suggestions are welcome.
One solution is to make the endpoint return an HTTP status code indicating that something isn't correct (like a 500) and then set up a Route53 Health Check with e-mail notifications (using SNS).
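For illustration, a minimal sketch of that idea, assuming a Flask service (check_db and check_activemq are hypothetical placeholders for whatever checks /getStatus already performs):

from flask import Flask, jsonify

app = Flask(__name__)

def check_db():
    # hypothetical: return True if the DB connection is alive
    return True

def check_activemq():
    # hypothetical: return True if the ActiveMQ cache is reachable
    return True

@app.route("/getStatus")
def get_status():
    status = {"db": check_db(), "activemq": check_activemq()}
    # Route 53 treats 2xx/3xx as healthy and 5xx as a failure.
    return jsonify(status), (200 if all(status.values()) else 500)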
The basic procedure for configuring email alerts is pretty straightforward.
If you want detailed instructions, this guide covers how to set up AWS email alerts on resource status change, and it includes a few additional steps to refine the reports to be a bit more user-friendly and to send them directly to a third-party messenger service.
The workflow will look like this:
1. Create a Route 53 Health Check;
2. Route 53 initializes Health Checker nodes in various regions;
3. Health Checkers ping the specified URL;
4a. Status is OK if a TCP connection is established within 10 seconds and an HTTP status code of 2xx or 3xx is returned within 2 seconds; or
4b. Status is FAILURE otherwise: the TCP connection fails or times out, the HTTP status code is 4xx or 5xx, or the page is too slow (yes, a slow 200 response can cause a failure);
5. Health Checker nodes retry the endpoint as configured;
6. A CloudWatch alarm is triggered on Health Check status change;
7. The alarm is delivered to an AWS SNS topic;
8. AWS SNS notifies the topic subscribers.
Advanced configuration may be applied to enhance the notification contents and delivery method, per the guide above.
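For reference, a rough boto3 sketch of steps 1 and 6-8 (the domain, topic name and email address are placeholders; note that Route 53 health check metrics are only published to CloudWatch in us-east-1):

import uuid
import boto3

route53 = boto3.client("route53")
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
sns = boto3.client("sns", region_name="us-east-1")

# Step 1: health check against /getStatus (placeholder domain).
hc = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),
    HealthCheckConfig={
        "Type": "HTTPS",
        "FullyQualifiedDomainName": "api.example.com",
        "Port": 443,
        "ResourcePath": "/getStatus",
        "RequestInterval": 30,
        "FailureThreshold": 3,
    },
)
hc_id = hc["HealthCheck"]["Id"]

# Steps 7-8: topic with an email subscriber.
topic_arn = sns.create_topic(Name="getstatus-alerts")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")

# Step 6: alarm when the check reports unhealthy (HealthCheckStatus drops below 1).
cloudwatch.put_metric_alarm(
    AlarmName="getstatus-health",
    Namespace="AWS/Route53",
    MetricName="HealthCheckStatus",
    Dimensions=[{"Name": "HealthCheckId", "Value": hc_id}],
    Statistic="Minimum",
    Period=60,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    AlarmActions=[topic_arn],
)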
I work for the team that develops Axibase Time Series Database (ATSD).
I would suggest a CloudWatch event that runs on a schedule you decide (e.g. every 5 minutes).
The event would call a Lambda function, which would make the /getStatus call and decide whether an email needs to be sent. If it does, I would further suggest AWS SES to send a custom-formatted email with the appropriate alerts to the person(s) who are supposed to get them.
Using the above tools would be 'serverless', would cost very little to nothing, and has the benefit of not running on an instance you have to worry about.
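A minimal sketch of such a Lambda handler (the URL, addresses, and the per-module "OK" convention are assumptions about the /getStatus payload; the Source address must be an SES-verified identity):

import json
import urllib.request
import boto3

ses = boto3.client("ses")

def handler(event, context):
    # Poll the service (placeholder URL).
    with urllib.request.urlopen("https://api.example.com/getStatus", timeout=10) as resp:
        status = json.load(resp)
    # Hypothetical convention: each module reports "OK" when healthy.
    failed = {module: state for module, state in status.items() if state != "OK"}
    if failed:
        ses.send_email(
            Source="alerts@example.com",  # must be an SES-verified identity
            Destination={"ToAddresses": ["ops@example.com"]},
            Message={
                "Subject": {"Data": "getStatus reported a problem"},
                "Body": {"Text": {"Data": json.dumps(failed, indent=2)}},
            },
        )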
In testing code that uses the SendGrid Email Activity API, I have received "too many messages" errors. I have examined the "rate limit" response headers and it appears that I am being limited to 10 requests per 5 minute block in the day. That is, the first 5 minutes of every hour can have 10 requests, the next 5 minutes can have 10 requests, etc.
I asked SendGrid support about this. The first response was pretty generic, but it seems to indicate that the threshold is correct, and it says I really should be using webhooks to get the status. I haven't found anything in the documentation saying this, and I haven't seen anything that specifies what the rate limits are.
For those of you using the Email Activity API, are you limited to 10 requests per 5 minutes? If yes, what do you do with the API?
Here's a snippet of what I ended up using with requests, tenacity and ratelimit:
from ratelimit import limits, sleep_and_retry
import logging
import requests
import tenacity

logger = logging.getLogger(__name__)

# Throttle to 2 calls per minute, and retry on HTTP errors (e.g. 429s)
# up to 10 times, waiting a minute between attempts.
@sleep_and_retry
@limits(calls=2, period=60)
@tenacity.retry(
    retry=tenacity.retry_if_exception_type(requests.exceptions.HTTPError),
    stop=tenacity.stop_after_attempt(10),
    wait=tenacity.wait_fixed(60),
)
def _call_api(headers, params):
    response = requests.get(
        "https://api.sendgrid.com/v3/messages",
        headers=headers,
        params=params,
    )
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        logger.info(f"Request failed {response.headers}, retrying in 1 minute")
        raise
    return response
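For reference, a call might then look like this (hypothetical API key; limit is one of the query parameters the endpoint accepts):

_call_api(
    headers={"Authorization": "Bearer <SENDGRID_API_KEY>"},
    params={"limit": 10},
)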
I received a response from SendGrid support that says:
Your findings are correct in that we do limit this endpoint to 10 requests per 5 minutes. This is a hard limit that we do not have the means of raising. The Email Activity Feed as well as the Email Activity API endpoint are meant for troubleshooting specific issues and attaining detailed message metadata.
I previously found the rate limit to be 10 requests per 5 minutes, but it appears that SendGrid changed the rate limit to 2 requests every 60 seconds sometime in the past week. Can anyone confirm this?
I'm using the webhook to report non-delivery back to my application, but I also need to use the Activity API to resolve async bounce notifications. Async bounces are when a destination mail server accepts a message during the SMTP session but subsequently sends a bounce notification email. When this occurs, SendGrid does not provide the detail of the bounced message in the webhook, and the message is incorrectly reported as delivered in the SendGrid app. When asked, they respond that there's nothing they can do about it, even though I have explained to them how I use their Activity API to resolve this.
I pay extra to use the Activity API to fix a problem that they should address themselves, so I'm very frustrated that they apply such restrictive rate limits, then change them without notice.
Hi, I got this error in one of my E2E tests, which exercises login functionality and start-up behavior for my Angular app.
The error appears to be triggered by logging in using
await this.angularFireAuth.auth.signInWithEmailAndPassword(uname, pw);
where angularFireAuth is an injected instance of AngularFireAuth from '@angular/fire/auth';
I checked the Firestore quotas here but I can't find a reference to a quota for verifying passwords. Can anybody point me to what the quota is?
The console error reported looks like this:
zone-evergreen.js:659 Unhandled Promise rejection: Exceeded quota for verifying passwords. ; Zone: ProxyZone ; Task: Promise.then ; Value: u
The problem resolves itself after a few minutes, and then the test runs as expected.
I found the message you are receiving discussed in this GitHub thread.
Here are some of the important comments from the thread:
For the error you are facing "Exceeded quota for verifying passwords", this usually happens when one sends requests for verifying passwords or password login requests too many times at once (more than 20 requests per second per IP address or 25 requests per 10 min per account). When we get a huge amount of requests in a short period of time, the limit is applied automatically to protect our servers.
This is an internal quota (regardless of pricing plans) enforced by Firebase Authentication to prevent abuse when making authentication requests, for this reason the quota can change without notice.
In order to avoid triggering this alert, you can use a different IP address or back off the number of requests per minute to something like 10-20, to avoid triggering the automated abuse detection.
If you are sending too many requests in a short period of time from the same IP address, then there is an expectation that you will get throttled at some point. This may prevent you from getting successful integration tests but there is a security benefit that comes with that. The easier it is for you to test, the easier it is for malicious scripts to be written too against your project. We have similar integration tests in other firebase auth libraries (client and admin) and we try to work with the limit.
If you have a legitimate need to increase the limit, then you can file a bug with support and make a case for that. You could even file for a feature request to whitelist calls from certain IP addresses.
We have a CloudFormation stack that we want to provide to our clients. When they run the stack, we want to receive some output values directly, i.e. we don't want to rely on them sending us the output. Our first thought was to use SNS and the notification capabilities of CF, but it seems that the topic must be in the account running the template and can't be in another account. We also considered subscribing to an existing SNS topic as part of the template, but that doesn't cause a message to be sent.
We realize that CF is a resource creation tool but we think there must be a way to get the info relayed to us automatically. Doesn't have to be SNS. Any ideas on how we might be able to do this?
Update your CF script to contain a Lambda and a CloudWatch rule which runs every 5 minutes on a cron.
Give the Lambda IAM permissions to query the stack and get any output values you require.
When the Lambda triggers, query the data you need and send it to yourself however you see fit, e.g. an HTTP POST to an API you own.
To finish up, your Lambda should call the CloudWatch Events API to disable the CloudWatch rule so this code doesn't run again.
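A rough sketch of such a Lambda (the stack name, endpoint URL, and rule name are placeholders):

import json
import urllib.request
import boto3

def handler(event, context):
    # Read the stack outputs (hypothetical stack name).
    cfn = boto3.client("cloudformation")
    stack = cfn.describe_stacks(StackName="client-stack")["Stacks"][0]
    outputs = {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}

    # Relay them to an endpoint you own (placeholder URL).
    req = urllib.request.Request(
        "https://api.example.com/stack-outputs",
        data=json.dumps(outputs).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10)

    # Disable the rule so this only runs once (hypothetical rule name).
    boto3.client("events").disable_rule(Name="report-stack-outputs")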
You should consider if all this offsets the effort of saying to your client "please send us details of x y z". If you have 10 clients, probably not, if you have 1000 clients then possibly.
In our design we have something of a paradox. We have a database of projects. Each project has a status. We have a REST API to change a project from “Ready” status to “Cleanup” status. Two things must happen.
update the status in the database
send out an email to the approvers
Currently the RESTful API does 1 and, if that succeeds, does 2.
But sometimes the email fails to send, and since (1) is already committed, it is not possible to roll back.
I don't want to send the email prior to the commit, because I want to make sure the commit is successful before sending the email.
I thought about undoing step 1, but that is very hard. The status change involves adding new records to the history table, so I would need to delete them. And if another person makes other changes concurrently, the undo might get messed up.
So what can I do? If (2) fails, should I return “200 OK” to the client?
Seems like the best option is to return “500 Server Error” with error message that says “The project status was changed. However, sending the email to the approvers failed. Please take appropriate action.”
Perhaps I should not try to do 1 + 2 in a single operation? But that just puts the burden on the client, which is worse!
Just some random thoughts:
You can have a notification-sent status flag along with a datetime of submission. When an email succeeds, the flag flips; if not, it stays. When changes are submitted, your code iterates through ALL unsent notifications and tries to send them (see the sketch below). I have no idea what backend DB you are using, but I believe many have the functionality to send emails as well. You could have a scheduled job (SQL Server Agent for MSSQL) that runs hourly and tries to send if the datetime of the submission has lapsed a certain amount, or starts setting off alarms if it fails as well.
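A rough sketch of that flag-and-retry idea (the notifications table and the mailer are hypothetical; sqlite3 stands in for whatever backend DB is in use):

import sqlite3

# Hypothetical outbox table, written in the same transaction as the
# status change itself:
#   CREATE TABLE notifications (id INTEGER PRIMARY KEY, recipient TEXT,
#                               body TEXT, sent INTEGER DEFAULT 0,
#                               submitted_at TEXT);

def flush_unsent(conn, send_mail):
    # send_mail is a hypothetical mailer callable that raises on failure.
    rows = conn.execute(
        "SELECT id, recipient, body FROM notifications WHERE sent = 0"
    ).fetchall()
    for nid, recipient, body in rows:
        try:
            send_mail(recipient, body)
        except Exception:
            continue  # flag stays unset; the scheduled job retries later
        conn.execute("UPDATE notifications SET sent = 1 WHERE id = ?", (nid,))
        conn.commit()

Because the notification row is committed together with the status change, the email either goes out or stays queued for the next run; nothing is lost if the mailer is down.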
If it is that insanely important, then maybe you could integrate a third-party service such as SendGrid to run as a backup sending mechanism. That of course would be more $$ though...
Traditionally I've always separated functions like this into a backend worker process that handles this kind of administrative tasking stuff across many different applications. Some notifications get sent out every morning. Some get sent out every 15 minutes. Some are weekly summaries. If I run into a crash and burn then I light up the event log and we are (lucky/unlucky) enough to have server monitoring tools that alert us on specified application events.
Currently my application runs and inserts events into a protected PostgreSQL DB. That's cool, and it allows for auditing of user logins and such.
What I would like to do is take failed login events, after they reach a certain threshold, and report them via SNMP message to another service (like an SNMP server). I just can't seem to wrap my head around how.
I thought of maybe using a POST to a failure page, and inside that PHP script posting to PostgreSQL and querying events by user and by time, but it seems brutal. Maybe Python? I have options, but I can't think of a good implementation. Help?
I would suggest the following approach:
an ON INSERT trigger on the table where login attempts are logged checks the number of attempts, and then
when the login-attempt threshold is reached, it sends a NOTIFY notification;
finally, an external service LISTENs for the notification and,
upon receiving one, sends the SNMP message broadcast.
For more info see:
NOTIFY and LISTEN
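A rough sketch of the listening side, assuming psycopg2 and a hypothetical channel name (the trigger would call pg_notify('failed_logins', ...) once the threshold is hit; the actual SNMP send is left as a placeholder):

import select

import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=audit")  # hypothetical DSN
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
cur.execute("LISTEN failed_logins;")  # channel the trigger NOTIFYs on

while True:
    # Wait for the connection's socket to become readable, then drain
    # any pending notifications.
    if select.select([conn], [], [], 60) == ([], [], []):
        continue  # timed out; loop and wait again
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        # notify.payload carries whatever the trigger passed to pg_notify,
        # e.g. the username. Send the SNMP trap here (e.g. via pysnmp).
        print("login threshold exceeded:", notify.payload)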