I am writing a Compliance Integration in Python that uses the Facebook Graph API to search all user content in our Workplace community for given keywords. It previously worked every time, but recently (over the last couple of days) one of the requests sent to Facebook returns a FacebookApiException with error code 960 and the message "Request aborted. This could happen if a dependent request failed or the entire request timed out.", after thousands of requests have already completed successfully. This doesn't happen every run, but more often than not it will fail.
{
  "error": {
    "message": "Request aborted. This could happen if a dependent request failed or the entire request timed out.",
    "code": 960,
    "type": "FacebookApiException",
    "fbtrace_id": "B72L8jiCFZy"
  }
}
For simplicity I haven't been using dependencies in my requests, so I can only think that it is timing out. My question is -- what is the timeout period for the Facebook Graph API? Is it timing out because I am taking too long to send a request, or is it timing out because the Facebook server is taking too long to respond to my request? Is there any way I can increase the timeout to stop the error message occurring?
TIA
This question is older, but in case anyone else is looking for an answer.
I can't answer what the timeout period is for the Facebook Graph API, but I can point out a workaround for those who are running into timeout errors.
Facebook has documentation for how to deal with timeouts:
https://developers.facebook.com/docs/graph-api/making-multiple-requests/#timeouts
Large or complex batches may timeout if it takes too long to complete all the requests within the batch. In such a circumstance, the result is a partially-completed batch. In partially-completed batches, responses from operations that complete successfully will look normal (see prior examples) whereas responses for operations that are not completed will be null.
The ordering of responses correspond with the ordering of operations in the request, so developers should process responses accordingly to determine which operations were successful and which should be retried in a subsequent operation.
So, according to their documentation, the response for a batch request that timed out should look something like this:
[
  { "code": 200,
    "headers": [
      { "name": "Content-Type",
        "value": "text/javascript; charset=UTF-8" }
    ],
    "body": "{\"id\":\"…\"}"
  },
  null,
  null,
  null
]
Using their example, you should just need to re-queue the items in your batch request array that correspond with the null responses.
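A minimal sketch of that retry loop, assuming post_batch is your own helper that POSTs a list of batch operations to the Graph API and returns the decoded JSON list of responses (post_batch and operations are hypothetical names from your own code, not Facebook APIs):

def send_with_retries(post_batch, operations, max_attempts=3):
    """Send a Graph API batch and re-queue any operations whose response is null."""
    results = [None] * len(operations)
    pending = list(range(len(operations)))   # indices still waiting for a response

    for _ in range(max_attempts):
        if not pending:
            break
        batch = [operations[i] for i in pending]
        responses = post_batch(batch)        # assumed to be in the same order as `batch`
        still_pending = []
        for i, response in zip(pending, responses):
            if response is None:             # timed out inside the batch: retry it
                still_pending.append(i)
            else:
                results[i] = response
        pending = still_pending

    # completed responses plus anything that never succeeded after all attempts
    return results, [operations[i] for i in pending]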
I am currently managing two Google Analytics Management Accounts with many clients and view_ids on each one. The task is to request client data via the Google Analytics Reporting API (v4) and store it in a SQL backend on a daily basis via an Airflow DAG structure.
For the first account everything works fine.
Just recently I added the second account to the data request routine.
The problem is that even though both accounts are set to the same "USER-100s" quota limits, I keep getting this error for the newly added account:
googleapiclient.errors.HttpError: <HttpError 429 when requesting https://analyticsreporting.googleapis.com/v4/reports:batchGet?alt=json returned "Quota exceeded for quota group 'AnalyticsDefaultGroup' and limit 'USER-100s' of service 'analyticsreporting.googleapis.com' for consumer 'project_number:XXXXXXXXXXXX'.">
I already set the quota limit "USER-100s" from 100 to the maximum of 1000, as recommended in the official Google guidelines (https://developers.google.com/analytics/devguides/config/mgmt/v3/limits-quotas).
I also checked the Google API Console and the number of requests for my project number, but so far I have never exceeded the 1000 requests per 100 seconds (see request history account 2), while the first account always works (see request history account 1). Still, the above error appeared.
I could also rule out the possibility that the 2nd account's clients simply have more data.
[screenshot: request history account 1]
[screenshot: request history account 2]
I am now down to a try-except loop that keeps on requesting until the data is eventually queried successfully, like:
from googleapiclient.errors import HttpError

success = False
data = None
while not success:
    try:
        data = query_data()  # trying to receive data from the API
        if data:
            success = True
    except HttpError as e:
        print(e)
This is not elegant at all and hard to maintain (or to cover with integration tests). In addition, it is very time and resource intensive, because the loop might sometimes run indefinitely. It can only be a short-term workaround.
This is especially frustrating because the same implementation works with the first account, which makes more requests, but fails with the second account.
If you know of any solution to this, I would be very happy to hear it.
Cheers Tobi
I know this question has been here for a while, but let me try to help you. :)
There are 3 standard request limits:
50k per day per project
2k per 100 seconds per project
100 per 100 seconds per user
As you showed in your image (https://i.stack.imgur.com/Tp76P.png), the quota group "AnalyticsDefaultGroup" refers to your API project, and the user quota is included in this limit.
Per your description, you are hitting the user quota and that usually happens when you don't provide the userIP or quotaUser in your requests.
So there are two main points you have to handle to prevent those errors:
Include the quotaUser with a unique string in every request;
Keep 1 request per second
From your code, I presume that you are using the default Google API Client for Python (https://github.com/googleapis/google-api-python-client), which doesn't have a global way to define the quotaUser.
To include the quotaUser:
analytics.reports().batchGet(
    body={
        'reportRequests': [{
            'viewId': 'your_view_id',
            'dateRanges': [{'startDate': '2020-01-01', 'endDate': 'today'}],
            'pageSize': '1000',
            'pageToken': pageToken,
            'metrics': [],
            'dimensions': []
        }]
    },
    quotaUser='my-user-1'
).execute()
That will make the Google API register your request for that user, using 1 of that user's 100-request limit, instead of lumping every request under the same user for your whole project.
Limit 1 request per second
If you plan to make a lot of requests, I suggest including a delay between every request using:
time.sleep(1)
right after each request to the API. That way you stay at or below 100 requests per 100 seconds.
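Putting both points together, here is a rough sketch of a request loop that sets quotaUser, pauses between requests, and backs off when a 429 still slips through. It assumes analytics is the authorized Reporting API service object and that view_ids and build_request_body exist in your own code (those names are placeholders):

import time
from googleapiclient.errors import HttpError

def fetch_all(analytics, view_ids, build_request_body, max_retries=5):
    results = {}
    for view_id in view_ids:
        body = build_request_body(view_id)        # your own request builder
        for attempt in range(max_retries):
            try:
                results[view_id] = analytics.reports().batchGet(
                    body=body,
                    quotaUser=f'view-{view_id}'   # unique string per "user"
                ).execute()
                break
            except HttpError as e:
                if e.resp.status == 429:          # quota exceeded: back off and retry
                    time.sleep(2 ** attempt)
                else:
                    raise
        time.sleep(1)  # roughly 100 requests per 100 seconds per quotaUser
    return results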
I hope I helped. :)
I have a RESTful API that is used by another internal application that posts updates to it.
The problem is that some unexpected peaks occur and, during those times, a request might take longer than 60 seconds (the limit defined by the load balancer, which I cannot change) to respond, which causes a 504 Gateway Timeout error.
When the latter application gets such response, it will retry the request again after 10 minutes or so.
This caused some requests to be processed twice, because the first request was successful, but took more than 60 seconds.
So I decided to use Idempotency Keys in the requests to avoid this problem. The issue is that I don't know what I should return in this case.
Should I just stick with 200 OK? Should I return some 4xx code?
I'd say it highly depends on whether it is an error for you or not, and the exact response code is more a matter of taste than best practice. But since I guess you're rejecting the duplicated requests, you want to report an error code such as 409 Conflict:
Indicates that the request could not be processed because of conflict
in the current state of the resource, such as an edit conflict between
multiple simultaneous updates.
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_errors
Whenever a resource conflict would be caused by fulfilling the request. Duplicate entries and deleting root objects when cascade-delete is not supported are a couple of examples.
https://www.restapitutorial.com/httpstatuscodes.html
A potentially useful reference is RFC 5789, which describes the PATCH method. Obviously, you aren't doing a patch, but the error handling is analogous.
For instance, if you were sending a JSON Patch document, then you might be ensuring idempotent behavior by including a test operation that checks that the resource is in the expected initial state. After your operation, that check would presumably fail. In that case, the error handling section directs your attention to RFC 5789 -- section 2.2 outlines a number of different possible cases.
Another source of inspiration is to look at RFC 7232 which describes conditional requests. The section on If-Match includes this gem:
An origin server MUST NOT perform the requested method if a received If-Match condition evaluates to false; instead, the origin server MUST respond with either a) the 412 (Precondition Failed) status code or b) one of the 2xx (Successful) status codes if the origin server has verified that a state change is being requested and the final state is already reflected in the current state of the target resource (i.e., the change requested by the user agent has already succeeded, but the user agent might not be aware of it, perhaps because the prior response was lost or a compatible change was made by some other user agent).
From this, I infer that 200 is completely acceptable if you can determine that the work was already done successfully.
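To make that concrete, here is a minimal Flask sketch of an idempotency-key check; the framework, the in-memory processed store, the /updates route and the header name are all illustrative assumptions, not part of the original API. It replays the stored result with 200 for a repeated key, and you could return 409 instead if you prefer to reject duplicates outright:

from flask import Flask, jsonify, request

app = Flask(__name__)
processed = {}  # idempotency key -> previously computed result (in-memory for the sketch)

@app.route('/updates', methods=['POST'])
def post_update():
    key = request.headers.get('Idempotency-Key')
    if key and key in processed:
        # The work already succeeded; following RFC 7232's reasoning, replaying
        # the original result with 200 is acceptable. Return 409 here instead
        # if you want duplicates to be treated as client errors.
        return jsonify(processed[key]), 200

    result = {'status': 'processed'}  # placeholder for the real update logic
    if key:
        processed[key] = result
    return jsonify(result), 201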
I am developing a webservice that allows users to request validation reports. Report generation might take up to 20 hours per report. When a new validation request is posted, I return a 202 Accepted answer with Location set to a processing queue (e.g. /queue/5). When the queue resource is polled, some processing information is provided:
<queueResponse>
  <status>QUEUED</status>
  <queuePosition>1</queuePosition>
</queueResponse>
Once processing completes successfully and the queue is polled, a 303 See Other redirects to the created resource (e.g. at /reports/5).
However, if a processing error occurs on the server, I simply return my queueResponse without a redirect and with the status set to <status>ERROR</status>.
Is this the best way to communicate a processing error to the client? Or should I instead simply return a 500 Internal Server Error when the queue is polled for a failed validation task?
Your current solution is best. A 500 error for the queued process information would indicate that the request for that resource had failed, not the process it was reporting on.
postscript: If your API is still being defined, I would suggest FAILED instead of ERROR, as it sounds more permanent. Errors are potentially recoverable situations, failures are not.
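For completeness, a client-side polling sketch that follows the conventions above (202 with Location, 303 on success, FAILED in the queueResponse); the host, the /validations path and the payload are hypothetical:

import time
import xml.etree.ElementTree as ET
from urllib.parse import urljoin
import requests

BASE = 'https://example.com'  # hypothetical host

def request_report(payload):
    # POST the validation request; the server answers 202 with a queue Location
    resp = requests.post(f'{BASE}/validations', data=payload)
    resp.raise_for_status()
    queue_url = urljoin(resp.url, resp.headers['Location'])

    while True:
        poll = requests.get(queue_url, allow_redirects=False)
        if poll.status_code == 303:                # done: follow to /reports/{id}
            return requests.get(urljoin(queue_url, poll.headers['Location']))
        status = ET.fromstring(poll.text).findtext('status')
        if status == 'FAILED':                     # processing failed on the server
            raise RuntimeError('validation failed')
        time.sleep(60)                             # still QUEUED / in progress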
I implemented an https/REST provider in node.js using express. The function calls a webservice, transforms/enhances the data and returns the transformed data as CSV in the response. Execution time of one GET request is between 4 minutes 30 seconds and 5 minutes. I want to test the implementation by calling the URL.
Problem:
Execution in Google Chrome fails since it runs too long; there is no option to increase the timeout value.
Execution in Mozilla Firefox: network.http.response.timeout changed. Now the request is executed over and over again. It looks like the response is ignored completely.
Execution in Postman: changed Settings -> General -> XHR timeout in ms (...). Nevertheless, execution stops every time after the same number of seconds with the message: "Could not get any response".
My question: which tool(s) can I use for reliable testing of long running http REST requests?
curl has a --max-time setting (in seconds) which should do what you want.
curl -m 330 http://you.url
But it might be worth creating a background job and polling for completion of the background job instead. HTTP isn't best suited to long running tasks.
I suggest you use Socket.IO for an asynchronous response with pub/sub when the CSV file is ready. The client sends the request with a timeout of, for example, 6 minutes; the server returns an ack to confirm that file processing has started, and when the file is ready the server returns it via Socket.IO. Socket.IO can be integrated with express.
http://socket.io/
Do you have control over the server? If so, you should alter how it operates. Instead of the initial request expecting a response containing the answer, your API should emit a token (a URI) from where the status of the operation can be obtained. The status will either be "in progress" or "completed; here's your answer: ..."
You make the problem (the long-running operation) into its own first-class entity on your server.
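A rough sketch of that shape, written in Python/Flask purely for brevity (the original service is node.js/express, and every route and name below is made up): the POST starts the work in the background and immediately returns a status URI the client can poll.

import threading
import uuid
from flask import Flask, jsonify, url_for

app = Flask(__name__)
jobs = {}  # job id -> {'status': ..., 'result': ...}; in-memory for the sketch

def run_job(job_id):
    # Placeholder for the long-running work (calling the webservice, building the CSV)
    jobs[job_id]['result'] = 'csv,data,here'
    jobs[job_id]['status'] = 'completed'

@app.route('/exports', methods=['POST'])
def start_export():
    job_id = uuid.uuid4().hex
    jobs[job_id] = {'status': 'in progress', 'result': None}
    threading.Thread(target=run_job, args=(job_id,), daemon=True).start()
    # Hand back the status URI instead of making the caller wait for the answer
    return jsonify({'status': url_for('get_status', job_id=job_id)}), 202

@app.route('/exports/<job_id>', methods=['GET'])
def get_status(job_id):
    job = jobs.get(job_id)
    if job is None:
        return jsonify({'error': 'unknown job'}), 404
    return jsonify(job)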
We are using the Graph API to get the number of shares for all posts on each of our clients' pages, running once per day. We use graph.facebook.com/post_id, but we often get
(#613) Calls to stream have exceeded the rate of 600 calls per 600 seconds
I tried using batch requests, but it seems each request in the batch is counted towards the limit. Any suggestions?
Here are our findings so far:
FQL stream table doesn't have a field for "shares".
Post insights have no metric matching the "#shares" as shown on the page wall.
Graph API calls per post will hit the limit quickly.
Make fewer calls - that's the only real answer here, assuming you've already made other optimisations, like asking for multiple posts' details in a single call (via the ?ids=X,Y,Z syntax mentioned on the homepage of the Graph API documentation).
Why does it need to be done 'once per day'? Why not spread the calls out over a few hours?
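As a sketch of that multi-id approach, something like the following (using the requests library; the access token, chunk size, pause and field list are assumptions, and the shares field depends on the Graph API version you are on) asks for up to 50 posts per call and pauses between calls:

import time
import requests

ACCESS_TOKEN = '...'                     # placeholder
GRAPH = 'https://graph.facebook.com'

def fetch_share_counts(post_ids, chunk_size=50, pause=2):
    """Fetch several posts per Graph API call instead of one call per post."""
    counts = {}
    for start in range(0, len(post_ids), chunk_size):
        chunk = post_ids[start:start + chunk_size]
        resp = requests.get(GRAPH, params={
            'ids': ','.join(chunk),          # ?ids=X,Y,Z multi-id syntax
            'fields': 'shares',
            'access_token': ACCESS_TOKEN,
        })
        resp.raise_for_status()
        for post_id, data in resp.json().items():
            counts[post_id] = data.get('shares', {}).get('count', 0)
        time.sleep(pause)                    # spread the calls out over time
    return counts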
It doesn't matter if you request by batch; each item will still be counted as one hit and you will reach the same limit. It's indicated in the FB docs:
https://developers.facebook.com/docs/graph-api/advanced/rate-limiting
You can try distributing your load with a timeout or delay in your cron job or something similar. Or executing the first batch and then the next batch an hour later is probably the safest.