Is there a response code to mark a Google Cloud Task as permanently failed so it won't retry?

I have a Firebase HTTPS function that sends timed messages and is triggered by Google Cloud Tasks.
According to the Cloud Tasks documentation, any response code outside the 200 range is seen as a failure and will trigger a retry.
This function needs to scale to millions of daily messages, so we need to avoid retrying messages that have a permanent failure (the person opted out, etc).
Note: This is especially important in this example because each task needs to look up the latest information before processing, adding 2-10 Firestore reads to each attempt. We can't send this info in the payload because it might change between the time the message was queued and it is processed.
It's easy to delete the task using the Cloud Tasks API, but I was wondering whether there is any HTTP response code (or header), such as 400 Bad Request, that marks a task as permanently failed so that it is not retried.

Only HTTP 2xx codes (200 to 299) are treated as task completion and stop the retries.
All other return codes are treated as failures and trigger a retry.
Note: 429 (and 503 for App Engine task queues) throttles the retries on the queue (to prevent service congestion).
If you want Cloud Tasks to stop retrying, return a 2xx code. That's the only way.
You could, for example, return 299 and set up an Error Reporting alert on that specific code, so that you can track these cases and be alerted when they occur.
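A minimal sketch of that idea, written in Python for brevity rather than in the Firebase (Node.js) runtime from the question. The `Outcome` enum and the `status_for` helper are hypothetical names, and using 299 as the "permanent failure" marker is only the convention suggested above; Cloud Tasks treats it like any other 2xx code:

```python
from enum import Enum


class Outcome(Enum):
    DELIVERED = "delivered"
    PERMANENT_FAILURE = "permanent_failure"   # e.g. the recipient opted out
    TRANSIENT_FAILURE = "transient_failure"   # e.g. a downstream timeout


def status_for(outcome: Outcome) -> int:
    """Map a handler outcome to the HTTP status returned to Cloud Tasks."""
    if outcome is Outcome.DELIVERED:
        return 200
    if outcome is Outcome.PERMANENT_FAILURE:
        # Still 2xx, so the task counts as completed and is not retried;
        # the unusual code makes these cases easy to filter in logs or alerts.
        return 299
    # Any non-2xx code counts as a failure and triggers a retry.
    return 500
```

The handler would decide the outcome after its Firestore lookups and respond with the mapped status, so an opted-out recipient costs one attempt instead of a full retry cycle.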

Related

Failure events behavior for DISCONNECT Intent

I'm implementing a handler for the DISCONNECT intent. Reading the online documentation at https://developers.google.com/assistant/smarthome/reference/intent/disconnect, I see that there is no response format and no guidance for the failure case, for example when we return an error because we didn't properly handle the disconnect, or when the Google API could not reach our server. Does Google Assistant implement some sort of automatic retry logic, so that in the event of an error it resends the request for a certain amount of time? In other words, is there a way to tell Google Assistant to retry, or to return an error to the user who made the DISCONNECT request?
There is no response format for the DISCONNECT intent in the https://developers.google.com/assistant/smarthome/reference/intent/disconnect documentation because nothing is returned when the action.devices.DISCONNECT intent is executed (e.g., Google does not retry based on your response status code). You can find more information on this at https://developers.google.com/assistant/smarthome/develop/process-intents#unlink
After an action.devices.DISCONNECT intent is sent, the Assistant stops sending any more intents for the user who has just been disconnected, and the associated cloud service also stops calling Home Graph APIs (Request Sync and Report State) for their devices and a SYNC request is triggered.
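As a rough illustration, a fulfillment handler could treat DISCONNECT as a fire-and-forget cleanup step. This is a sketch only: the function name, the `linked_users` set and the way the user id is obtained are hypothetical, and replying with an empty JSON object is an assumption based on the absence of a response format:

```python
def handle_smart_home_request(body: dict, linked_users: set, user_id: str) -> dict:
    """Return the JSON response body for a smart home fulfillment request."""
    intents = [item.get("intent") for item in body.get("inputs", [])]
    if "action.devices.DISCONNECT" in intents:
        # Drop local link state so the service stops calling Home Graph
        # (Request Sync / Report State) for this user.
        linked_users.discard(user_id)
        # No DISCONNECT response format is defined: reply with an empty object.
        return {}
    raise NotImplementedError("other intents are out of scope for this sketch")
```

Because no retry is expected, any cleanup that fails here has to be retried or reconciled by your own service.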

Firestore: "Exceeded quota for verifying passwords"?

Hi, I got this error in one of my end-to-end tests, which exercises the login functionality and start-up behavior of my Angular app.
The error appears to be triggered by logging in using
await this.angularFireAuth.auth.signInWithEmailAndPassword(uname, pw);
where angularFireAuth is an injected instance of AngularFireAuth from '@angular/fire/auth';
I checked the Firestore quotas here but I can't find a reference to a quota for verifying passwords. Can anybody point me to what the quota is?
The console error reported looks like this:
zone-evergreen.js:659 Unhandled Promise rejection: Exceeded quota for verifying passwords. ; Zone: ProxyZone ; Task: Promise.then ; Value: u
The problem resolves itself after a few minutes, and the test then runs as expected.
I have found the message you are receiving discussed in this GitHub thread.
Here are some of the important comments from the thread:
For the error you are facing "Exceeded quota for verifying passwords", this usually happens when one sends requests for verifying passwords or password login requests too many times at once (more than 20 requests per second per IP address or 25 requests per 10 min per account). When we get a huge amount of requests in a short period of time, the limit is applied automatically to protect our servers.
This is an internal quota (regardless of pricing plans) enforced by Firebase Authentication to prevent abuse when making authentication requests, for this reason the quota can change without notice.
In order to avoid triggering this alert, you can use a different IP address or back off the number of requests per minute to something like 10-20, to avoid triggering the automated abuse detection.
If you are sending too many requests in a short period of time from the same IP address, then there is an expectation that you will get throttled at some point. This may prevent you from getting successful integration tests, but there is a security benefit that comes with that. The easier it is for you to test, the easier it is for malicious scripts to be written against your project too. We have similar integration tests in other Firebase Auth libraries (client and admin) and we try to work within the limit.
If you have a legitimate need to increase the limit, then you can file a bug with support and make a case for that. You could even file for a feature request to whitelist calls from certain IP addresses.
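One way to stay under the limit in a test suite is to space out the login attempts. The sketch below is a generic client-side throttle, not part of any Firebase library; the `Throttle` class and its parameters are hypothetical, and the clock and sleep functions are injectable so it can be exercised without real delays:

```python
import time


class Throttle:
    """Spaces out calls so that at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A test helper would call `throttle.wait()` right before each sign-in, for example with `Throttle(10)` to stay at roughly ten logins per minute.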

Email alert on AWS for webservice health

I have deployed a webservice on AWS EC2 instances.
I have also implemented a REST call, /getStatus, which returns the status of the modules in my service in JSON format, such as the DB connection status, ActiveMQ cache status, etc.
I want a way to create an automatic email trigger that sends mail whenever an issue is found in the response of the /getStatus REST call.
I am looking into whether it's possible using CloudWatch, but any other suggestions are welcome.
One solution is to make the endpoint return an HTTP status code indicating that something isn't correct (like a 500) and then set up a Route53 Health Check with e-mail notifications (using SNS).
The basic procedure for configuring email alerts is pretty straightforward. Use this flowchart to get started.
If you want detailed instructions, this guide covers how to set up AWS email alerts upon resource status change and includes a few additional steps to refine the reports to be a bit more user-friendly and sent directly to a third-party messenger service.
The workflow will look like this:
1. Create a Route53 Health Check;
2. Route53 initializes Health Checker nodes in various regions;
3. Health Checkers ping the specified URL;
4a. Status is OK if a TCP connection is established within 4 seconds and an HTTP status code 2xx or 3xx is retrieved within 2 seconds after connecting;
OR
4b. Status is FAILURE otherwise: the TCP connection fails, the TCP connection times out, the HTTP status code is 4xx or 5xx, or the page is too slow (yes, a slow 200 response can cause failure);
5. Health Checker nodes retry the endpoint as configured;
6. A CloudWatch alarm is triggered on Health Check status change;
7. The alarm is delivered to an AWS SNS topic;
8. AWS SNS notifies the topic subscribers.
Advanced configuration may be applied to enhance notification contents and delivery method per above guide.
I work for the team that develops Axibase Time Series Database (atsd).
I would suggest a CloudWatch event that runs on a schedule you decide (e.g. every 5 minutes).
The event would call a Lambda function, which would make the /getStatus call and decide whether an email needs to be sent. If it does, I would further suggest AWS SES to send a custom-formatted email with the appropriate alerts to the person(s) who are supposed to get them.
Using the above tools would be 'serverless', would cost very little to nothing, and would have the benefit of not running on an instance you have to worry about.
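The decision step inside such a Lambda function might look roughly like the sketch below. The shape of the /getStatus JSON (a flat mapping of module name to a state string, with "UP" meaning healthy) is an assumption, since the question does not show it, and the helper names are hypothetical. Fetching the URL and sending the mail through SES are left out:

```python
def find_unhealthy(status: dict, healthy_value: str = "UP") -> list:
    """Return the names of modules whose reported state is not healthy."""
    return sorted(name for name, state in status.items() if state != healthy_value)


def build_alert(status: dict):
    """Return an email (subject, body) pair, or None when everything is healthy."""
    failing = find_unhealthy(status)
    if not failing:
        return None
    subject = f"/getStatus alert: {len(failing)} module(s) unhealthy"
    body = "\n".join(f"{name}: {status[name]}" for name in failing)
    return subject, body
```

The scheduled function would call `build_alert` on the parsed response and only send an email when the result is not `None`.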

What is going wrong in this SIP call? Multiple NOTIFY messages in a row before RTP established

The long string of NOTIFY messages happens after the called number answers. After about 20-30 seconds the 503 happens, and then the call connects fine with audio.
If that trace is for a single call, it's an incredibly complex one. After spending a bit of time looking over it, I don't think it is for a single call; instead there appear to be a few different calls mixed up in it. It's complicated by the fact that 10.10.20.1 is a Back to Back User Agent (B2BUA) and is initiating its own calls in response to different events.
As to your question about the NOTIFY request, it's originally generated by the UAC at 10.10.10.3 as part of what appears to be an attended transfer. The REFER request is the start of the transfer. An implicit subscription, which is what the NOTIFY request is part of, gets created for a REFER transaction (see https://www.rfc-editor.org/rfc/rfc3515 and also https://www.rfc-editor.org/rfc/rfc4488, which deals with suppressing the implicit subscription).
For an attended transfer the NOTIFY request allows a call leg end point to indicate that the transfer has been processed successfully. In this case it looks like the user agent at 10.10.10.3 isn't happy to accept the transfer until it gets a response to its NOTIFY request. This is unusual behaviour as typically the NOTIFY requests are for just that, notifying agents of events not controlling call flow. Once 10.10.10.3 gets the 503 response to its NOTIFY request it finally starts sending the RTP to 10.10.20.4. It mustn't care what the response is as 503 is an error condition and would usually result in whatever was waiting for it to fail.

Cancel gwt rpc call

This example has a nice description of how to implement timeout logic using Timer#schedule, but there is a pitfall. Suppose we have two RPC requests: the first does a lot of computation on the server (or retrieves a large amount of data from the database), and the second is a tiny request that returns its result immediately. If we make the first request, we will not receive its result immediately; instead we hit the timeout. After the timeout we make the second, tiny request, and at that point the abortFlag from the example will be true, so we can receive the result of the second request. But we can also still receive the result of the first request, which already timed out, because the AsyncCallback object of the first call was never destroyed.
So we need some way of cancelling the first RPC call after the timeout occurs. How can I do this?
Let me give you an analogy.
You, the boss, make a call to a supplier to get some product info. The supplier says they need to call you back because the info will take some time to gather. So, you give them the contact of your foreman.
Your foreman waits for the call. Then you tell your foreman to cancel the info request if it takes more than 30 minutes.
Your foreman thinks you are bonkers because he cannot cancel the request, because he does not have an account that gives him privilege to access the supplier's ordering system.
So, your foreman simply ignores any response from the supplier after 30 minutes. Your ingenious foreman sets up a timer in his phone that ignores the call from the supplier after 30 minutes. Even if you killed your foreman, cut off all communication links, the vendor would still be busy servicing your request.
There is nothing on the GWT client-side to cancel. The callback is merely a JavaScript object waiting to be invoked.
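The foreman's "ignore the call after 30 minutes" approach can be sketched as a small wrapper around the callback. This is a language-neutral illustration in Python, not GWT code; `TimedCallback` and its methods are hypothetical names:

```python
class TimedCallback:
    """Wraps a callback and drops any result that arrives after expiry."""

    def __init__(self, on_result):
        self._on_result = on_result
        self._expired = False

    def expire(self):
        # Called by the timer; the request itself keeps running on the server.
        self._expired = True

    def __call__(self, result):
        if self._expired:
            return False  # late response: ignore it
        self._on_result(result)
        return True
```

Giving each request its own wrapper, rather than sharing one abort flag, means that expiring the slow request does not affect the callback of the later, fast one.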
To cancel the call, you need to tell the server-side to stop wasting CPU resources (if that is your concern). Your server-side must be programmed to provide a service API which, when invoked, cancels the job and returns immediately to trigger your GWT callback.
You can refresh the page, and that would discard the page request and close the socket, but the server side would still be running. And when the server side completes its tasks and tries to send an HTTP response, it would fail, saying in the server logs that it had lost the client socket.
It is a very straightforward piece of reasoning.
Therefore, it falls into the design of your servlet/service, how a previous request can be identified by a subsequent request.
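One possible shape for that server-side design is a registry that hands out an id for each long-running job, so that a later request can name the job it wants stopped. The sketch below is in Python rather than a Java servlet, and `JobRegistry` and its methods are hypothetical:

```python
import threading
import uuid


class JobRegistry:
    """Tracks running jobs so a later request can ask one to stop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cancel_flags = {}

    def start(self) -> str:
        """Register a new job and return the id the client must remember."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._cancel_flags[job_id] = threading.Event()
        return job_id

    def cancel(self, job_id: str) -> bool:
        """Request cancellation; False means unknown or already finished."""
        with self._lock:
            flag = self._cancel_flags.get(job_id)
        if flag is None:
            return False
        flag.set()
        return True

    def is_cancelled(self, job_id: str) -> bool:
        with self._lock:
            flag = self._cancel_flags.get(job_id)
        return flag is not None and flag.is_set()

    def finish(self, job_id: str) -> None:
        """Forget a job once it has completed or stopped."""
        with self._lock:
            self._cancel_flags.pop(job_id, None)
```

The long-running job would poll `is_cancelled` between units of work and return early when it is set; cancellation is cooperative, so work that never checks the flag cannot be interrupted this way.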
Cascaded Callbacks
If request 2 depends on the status of request 1, you should perform a cascaded callback rather than submitting the two requests one after another: if request 2 is to run on success, place it in the onSuccess block of the callback; if it is to run on failure, place it in the onFailure block.
Otherwise, your timer should trigger request 2, and request 2 would have two responsibilities:
tell the server to cancel the previous request
get the small piece of info