What is the retry policy for Apple Server-to-Server Notifications?

I'm implementing a server-side application that manages subscriptions for an iOS application.
To track account state (whether a subscription is active or not) on the backend, I'm using Apple Server-to-Server Notifications.
The documentation says:
Respond to Server-to-Server Notifications
Your server should send an HTTP status code to indicate whether the server-to-server notification post succeeded:
Send HTTP 200 if the post was successful. Your server is not required to return a data value.
Send HTTP 50x or 40x to have the App Store retry the notification if the post was not successful. The App Store makes several attempts to retry the notification over a period of time but eventually stops after continued failed attempts.
But it is not clear exactly which retry policy Apple follows when my server returns an error.
I'm looking for answers to the following questions:
How many retries will Apple make?
What is the interval between retries?

The retry policy for App Store server notifications depends on the version of the server notification. It retries as follows:
For version 1 notifications, it retries three times; at 6, 24, and 48 hours after the previous attempt.
For version 2 notifications, it retries five times; at 1, 12, 24, 48, and 72 hours after the previous attempt.
See here for details.
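As a rough illustration of the schedule quoted above, the following sketch computes when retries would be expected after a failed first attempt. It assumes each offset is counted from the previous attempt, as the quoted wording says; `RETRY_HOURS` and `retry_schedule` are hypothetical names, not part of any Apple API.

```python
from datetime import datetime, timedelta

# Retry offsets in hours, copied from the answer above (assumption: each
# offset is measured from the previous attempt, not from the first one).
RETRY_HOURS = {
    1: [6, 24, 48],
    2: [1, 12, 24, 48, 72],
}


def retry_schedule(first_attempt, version):
    """Return the expected retry times for a notification version (1 or 2)."""
    times = []
    current = first_attempt
    for hours in RETRY_HOURS[version]:
        current = current + timedelta(hours=hours)
        times.append(current)
    return times


if __name__ == "__main__":
    for when in retry_schedule(datetime(2024, 1, 1, 12, 0), version=2):
        print(when.isoformat())
```

Under this reading, the last version 2 retry would land 157 hours after the first attempt, so a server outage longer than that could lose the notification entirely.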

Refer: https://developer.apple.com/documentation/appstoreservernotifications/your_server
Upon receiving a server notification, respond to the App Store with an HTTP status code of 200 if the post was successful. If the post was unsuccessful, send HTTP 50x or 40x to have the App Store retry the notification.
If the App Store server doesn’t receive a 200 response from your server after the initial notification attempt, it retries three times. App Store retries at 6, 24, and 48 hours after its initial attempt. While App Store server notifications report state changes in real-time, you can always initiate receipt validation to get an up-to-date receipt. For more information

As you said, Apple does not provide a clear answer in the documentation. However, from the Apple WWDC 2019 session video: https://developer.apple.com/videos/play/wwdc2019/302/?time=637
"However, should you not return a 200 response, we will retry up to three times to resend the notification to you"
Some manual testing suggests that they keep retrying the message for one hour.

Apple will attempt to retry 3 times over 3 days.

Related

SendGrid Email Activity API rate limit

In testing code that uses the SendGrid Email Activity API, I have received "too many messages" errors. I have examined the "rate limit" response headers and it appears that I am being limited to 10 requests per 5 minute block in the day. That is, the first 5 minutes of every hour can have 10 requests, the next 5 minutes can have 10 requests, etc.
I asked SendGrid support about this. The first response was pretty generic, but it seems to indicate that the threshold is correct and says I really should be using webhooks to get the status. I haven't found anything in the documentation saying this, and I haven't seen anything that specifies what the rate limits are.
For those of you using the Email Activity API, are you limited to 10 requests per 5 minutes? If yes, what do you do with the API?
Here's a snippet of what I ended up using with requests, tenacity and ratelimit:
import logging

import requests
import tenacity
from ratelimit import limits, sleep_and_retry

logger = logging.getLogger(__name__)


@sleep_and_retry
@limits(calls=2, period=60)  # at most 2 calls per 60 seconds
@tenacity.retry(
    retry=tenacity.retry_if_exception_type(requests.exceptions.HTTPError),
    stop=tenacity.stop_after_attempt(10),
    wait=tenacity.wait_fixed(60),
)
def _call_api(headers, params):
    # Query the Email Activity API; an HTTP error triggers a retry after 60 seconds.
    response = requests.get(
        "https://api.sendgrid.com/v3/messages",
        json={},
        headers=headers,
        params=params,
    )
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        logger.info(f"Request failed {response.headers}, retrying in 1 minute")
        raise
    return response
I received a response from SendGrid support that says:
Your findings are correct in that we do limit this endpoint to 10 requests per 5 minutes. This is a hard limit that we do not have the means of raising. The Email Activity Feed as well as the Email Activity API endpoint are meant for troubleshooting specific issues and attaining detailed message metadata.
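Instead of sleeping a fixed 60 seconds, a client could derive the wait from the rate limit response headers mentioned in the question. The sketch below assumes the headers are named `X-RateLimit-Remaining` and `X-RateLimit-Reset`, and that the reset value is a Unix timestamp; both are assumptions to verify against real responses, and `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, based on rate limit headers.

    Assumes (not confirmed by the thread) that X-RateLimit-Remaining is the
    number of requests left in the window and X-RateLimit-Reset is a Unix
    timestamp marking when the window resets.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # still have budget in the current window
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


if __name__ == "__main__":
    example = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}
    print(seconds_until_reset(example, now=1000))
```

The caller would pass `response.headers` and sleep for the returned number of seconds before issuing the next request.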
I previously found the rate limit to be 10/5min but it appears that SendGrid have changed the rate limit to 2 requests every 60 seconds sometime in the past week. Can anyone confirm this?
I'm using the webhook to report non-delivery back to my application but I also need to use the activity API to resolve async bounce notifications. Async bounces are when a destination mail server accepts a message during the smtp session but subsequently sends a bounce notification email. When this occurs, SendGrid do not provide the detail of the bounced message in the webhook and the message is incorrectly reported as delivered in the SendGrid app. When asked, they respond that there's nothing they can do about it, even though I have explained to them how I use their activity api to resolve this.
I pay extra to use the activity API to fix a problem that they should address themselves, so I'm very frustrated that they apply such restrictive rate limits, then change them without notice.

Is there a response code to mark a Google Cloud Task as permanently failed so it won't retry?

I have a Firebase HTTPS function that sends timed messages and is triggered by Google Cloud Tasks.
According to the Cloud Tasks documentation, any response code outside the 200 range is seen as a failure and will trigger a retry.
This function needs to scale to millions of daily messages, so we need to avoid retrying messages that have a permanent failure (the person opted out, etc).
Note: This is especially important in this example because each task needs to look up the latest information before processing, adding 2-10 Firestore reads to each attempt. We can't send this info in the payload because it might change between the time the message was queued and it is processed.
It's easy to delete the task using the Cloud Tasks API, but I was wondering if there is any HTTP response code (or header) that can mark these tasks as permanently failed (400 Bad Request, for example) so they are not retried.
Only HTTP 2XX codes (200 to 299) are treated as task completion and stop the retries.
All other return codes are considered failures and trigger a retry.
Note: 429 (and 503 for App Engine task queues) throttles the retries on the queue (to prevent service congestion).
If you want to stop Cloud Tasks from retrying, return a 2XX code. That's the only way.
You could, for example, return 299 and set up an Error Reporting alert on that specific code to track these cases and be alerted to them.
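A minimal sketch of that idea, independent of any web framework: the handler maps the outcome of processing to a status code, using 299 for permanent failures so the task is not retried. `PermanentFailure` and `status_for_task` are hypothetical names, and the choice of 299 follows the suggestion above rather than any Cloud Tasks convention.

```python
class PermanentFailure(Exception):
    """Hypothetical error for tasks that must never be retried (e.g. user opted out)."""


def status_for_task(process, payload):
    """Run the task and return the HTTP status code the handler should send back."""
    try:
        process(payload)
    except PermanentFailure:
        # Any 2XX stops retries; 299 is distinct enough to alert on separately.
        return 299
    except Exception:
        # Any non-2XX code makes the queue retry the task.
        return 500
    return 200


if __name__ == "__main__":
    def opted_out(payload):
        raise PermanentFailure("recipient opted out")

    print(status_for_task(opted_out, {"user": "example"}))
```

The HTTPS function would then send the returned code as its response status, and an alert on 299 responses would surface the permanently failed tasks.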

What is max number of retry attempts for Twilio sms callback?

Twilio has provision for configuring callback URL while sending SMS, which is notified of events relating to changes in the delivery state of SMS.
What happens if my application misses one of these callback events? Say for example when my server is down and the callback request encounters a 502 or 500 response.
Does Twilio retry the callback?
If yes, how many attempts are made before abandoning the event notification?
Hope this thread isn't SUPER dead...
I set out to solve this problem myself by making twiliq.com. Disclosure: I'm the dude that made this thing.
You set it as your backup URL endpoint and it'll replay the message as often as you configure until your server recovers.
Twilio developer evangelist here.
Twilio webhooks (for SMS or phone calls) do not make retry attempts to the same URL if your application fails to respond with a 200 response.
However, you can supply a fallback URL that Twilio will request with the same parameters if your primary URL fails. We recommend that this fallback URL is not part of the same application so that if your main application is down, you can recover and continue the conversation, save the errors for later or return an error message to your user.
There is more detail on how best to use fallback URLs on the Twilio site.
Since that answer was posted, Twilio has added support for retry attempts on webhooks. By default it retries once on TCP connect or TLS handshake failures, but the types of failure it retries on can be adjusted, and the number of retries can be set anywhere from 0 to 5 inclusive.
Documentation can be found here:
https://www.twilio.com/docs/usage/webhooks/webhooks-connection-overrides

iPhone: Push notification reliability on bulk devices

I have been using https://github.com/Redth/APNS-Sharp to send push notification messages to all devices where my iPhone app is installed. It works very inconsistently!
How did this issue start?
We have an iPhone app with around 500 users. We noticed that most users are not receiving notification messages! While debugging in real time, I noticed the following sequence of events.
.....
10:37:33 AM - Notification Queued!
10:37:33 AM - Notification Queued!
10:37:33 AM - Notification Queued!
10:37:33 AM - Notification Queued!
10:37:36 - Connecting...
10:37:36 - Connected...
10:37:36 - Notification Success
10:37:36 - Notification Success
10:37:36 - Notification Success
...
10:37:36 - Error: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host.
10:37:39 - Connecting...
10:37:40 - Connected...
10:37:40 - Notification Success
10:37:40 - Notification Success
....
What I have done?
I created a test iPhone app using an Ad-Hoc production certificate and installed it on 5 devices. I tried sending multiple messages at the same time to all these devices and saw totally inconsistent behaviour in receiving them. Sometimes all 5 devices received the message instantly. Other times, 3 of the 5 devices received the message almost instantly, while of the other two, one sometimes received it instantly and the other received only the last message rather than all of them. Sometimes a device doesn't receive any message at all!
I also tried sending messages to one device at a time and noticed that once it starts receiving messages, they all arrive instantly, but sometimes it receives only the last message after a long delay (about 20 minutes).
What I have verified?
It uses the same connection to send all messages, so it doesn't open multiple connections.
It is using the correct certificate and push notification server.
I ran the Feedback service a few times, but it didn't return any device IDs.
Has anyone else noticed this behaviour? What could be the issue when you send messages to multiple devices? Is there anything else I could do to make push notifications reliable?
Thanks.
Try UrbanAirship. I have found it to be very consistent because of its cloud approach (having deployed several apps). A lot of major players are using it too (tapulous etc.). It's always better to delegate the headache of such things to the experts :). Plus, it's free.
I logged this bug with Apple and got a response from them. It seems they fixed client-side issues related to push notifications in iOS 5.0. Also, because push notification delivery is not guaranteed, this inconsistent behavior can happen. If anyone is interested, the bug ID is 10333505.

Is Apple's push notification service reliable?

I have an iOS app using push notification but once in a while I'm not getting a notification on my device when I expect to receive one. I would receive all the subsequent notifications. I confirmed with my backend to make sure that all the notifications were sent successfully.
So my question is: is APNs nearly 100% reliable or should I just expect to miss some notifications here and there because of intermittent 3G/wifi connection?
I would think that APNs works as a queueing system and retry if it wasn't successful within the first few times.
The APN service will queue messages up -- but Apple doesn't guarantee delivery of all messages. Only the last message from an application will be kept in the queue when the user is offline. Additionally, old messages may be deleted.
Local and Push Notification Programming Guide
Apple Push Notification Service includes a default Quality of Service (QoS) component that performs a store-and-forward function. If APNs attempts to deliver a notification but the device is offline, the QoS stores the notification. It retains only one notification per application on a device: the last notification received from a provider for that application. When the offline device later reconnects, the QoS forwards the stored notification to the device. The QoS retains a notification for a limited period before deleting it.
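The store-and-forward behaviour described in that passage can be modelled with a toy sketch. This is only an illustration of the documented semantics (one retained notification per app per device), not how APNs is implemented; `StoreAndForwardQueue` and its methods are hypothetical names, and the expiry of stored notifications is left out.

```python
class StoreAndForwardQueue:
    """Toy model: keeps only the latest notification per (device, app) while offline."""

    def __init__(self):
        self._pending = {}  # (device, app) -> most recent undelivered message

    def send(self, device, app, message, online):
        """Deliver immediately if the device is online, otherwise store the message."""
        if online:
            return message
        # A newer message overwrites any older one stored for the same app.
        self._pending[(device, app)] = message
        return None

    def reconnect(self, device):
        """Forward the stored notifications for a device and clear them."""
        keys = [key for key in self._pending if key[0] == device]
        return {app: self._pending.pop((dev, app)) for dev, app in keys}


if __name__ == "__main__":
    queue = StoreAndForwardQueue()
    for text in ("first", "second", "third"):
        queue.send("phone-1", "com.example.app", text, online=False)
    print(queue.reconnect("phone-1"))
```

In this model, three messages sent while the device is offline result in only the last one being forwarded on reconnect, which matches the "only the last message arrives" symptom reported in these threads.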
I have an Azure website (plus mobile service, service bus, DB, Active Directory etc.) that sends push notifications to a Xamarin app on a Windows Phone and an iPhone. The first notification is received by both. The second notification is received only by the Windows Phone; it doesn't make it to the iPhone. If I send another notification, it is received by both. So I investigated the behaviour a bit more and found that if I machine-gun a series of notifications (hand typed, one every 2 seconds), the Windows Phone receives them all but the iPhone receives only the first one. But if I wait a while and then send a notification, it is received by both devices. The next test was to see whether notifications are always received with a 5 minute gap. I sent two messages five minutes apart, and both the Windows Phone and the iPhone received both notifications.