Rate limit policy on queries to Azure Insights REST API for Events (Audit Logs)

I have some questions regarding the Azure Insights REST API for Events.
When I make an HTTP request to the Insights API for events, I receive the header "x-ms-ratelimit-remaining-subscription-reads" with the value "14999".
But the next query, one second later, returns the same value of remaining reads.
I see there is some throttling policy in place, but I would like to understand how it works and what the correct way to deal with it is.
In particular:
1) How many reads can I do per second?
2) If I use up all the remaining reads, how long should I wait before the counter is back at its maximum?
3) Is it decreased on every query attempt, regardless of the $top parameter and of how many results are returned?
Thank you!

This article seems to have the responses you need.
To answer the questions based on it:
There is no limit on the number of requests per second, but each instance of an ARM region allows 15k requests per hour per subscription. In the worst case you will be throttled after 15k requests, but you would have to be extremely unlucky for that, since every request would have to land on the same instance.
If you exceed the limit, the response tells you how long to wait, and you can build that logic around the Retry-After header. Happily, it's a matter of seconds.
I believe the $top parameter doesn't affect the count: no matter how many results are brought back, a paging request is still just one request.
As for getting 14999 requests remaining multiple times: as the documentation says, this is expected, since an ARM region has multiple instances and each instance has its own limit of 15k requests per subscription per hour. If you send requests close together and get the same remaining number, it just means that they happened to hit different instances within the same ARM region.
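The Retry-After logic described above could be sketched roughly as follows. This is a minimal sketch, not the official client behaviour: it assumes a throttled call comes back with HTTP status 429 and a Retry-After value in seconds, and `call_with_retry_after`, `send` and the `(status, headers, body)` response shape are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def call_with_retry_after(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and, on HTTP 429, wait Retry-After seconds before retrying."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return anything that is not a throttling response,
        # and give up once the last attempt has been used.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Assumption: Retry-After carries a number of seconds.
        # Fall back to default_wait if it is missing or not numeric.
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise ValueError("max_attempts must be at least 1")
```

Here `send` would wrap whatever HTTP client you use. Note that a plain dict lookup is case-sensitive, so a real client's header mapping may need a case-insensitive lookup instead.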

1) How many reads can I do per second?
Based on the rate limits published at https://azure.microsoft.com/en-in/documentation/articles/azure-subscription-service-limits/#subscription-limits, you can perform 15,000 reads per hour (I'm not sure that translates to a steady 4 reads per second).
2) If I use up all the remaining reads, how long should I wait
before the counter is back at its maximum?
Given that the rates are defined per hour, my guess is that you would have to wait until the next hour if you exhaust the 15,000 read request limit.
3) Is it decreased on every query attempt, regardless of the $top
parameter and of how many results are returned?
This is based on the number of API calls and not the amount of data returned, so I would say that setting the $top parameter should not have any impact on it.
When I make an HTTP request to the Insights API for events, I receive the
header "x-ms-ratelimit-remaining-subscription-reads" with the value
"14999". But the next query, one second later, returns the same value of
remaining reads.
I would assume there's some caching in play here. Is it the same request you're repeating or a different request altogether?

Related

Google Indexing API rateLimitExceeded

I have a Spring Batch process which submits around 5M URLs to the Google Indexing API. In the past, the process was segmented and parallelized into two threads by an attribute, one for the small segments and one for the bigger ones. A few days ago it was refactored to submit requests as they come from a query response (sorted by priority, ignoring the previous segmenting attribute, using a single thread). After that refactoring, I started getting a "rateLimitExceeded" error from the Google API. I have (by contract) 5M requests a day and I'm submitting batches of 500 URLs at a time. The average sending time is around 1.2 seconds for each 500-URL batch.
Does anybody know what may be causing this error?
I did not do the math, but if you are getting this exception, it means you are exceeding the limit. Depending on where you make the API call (i.e. in the item writer or in an item processor), you can do the math and delay the call as needed with a listener so that you do not exceed the limit.
You can find a similar question/answer here: Spring batch writer throttling
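The math mentioned in the answer can be worked through with the figures from the question. This is only a sketch: the question does not say which quota is actually being hit, and spreading the daily quota evenly over 24 hours is an assumption made here for illustration (a shorter-window quota, such as a per-minute one, may be what triggers the error).

```python
# Figures taken from the question above.
total_urls = 5_000_000        # URLs to submit, also the stated daily quota
batch_size = 500              # URLs per batch
seconds_per_batch = 1.2       # observed average send time per batch

batches = total_urls // batch_size                 # number of batches
burst_rate = batch_size / seconds_per_batch        # URLs per second while sending
run_hours = batches * seconds_per_batch / 3600     # time to send everything back to back

# Assumption: the daily quota is spread evenly over 24 hours.
seconds_per_day = 24 * 60 * 60
even_rate = total_urls / seconds_per_day           # URLs per second
even_interval = seconds_per_day / batches          # seconds between batch starts

print(f"{batches} batches, about {burst_rate:.0f} URLs/s for {run_hours:.1f} h")
print(f"evenly spread: about {even_rate:.0f} URLs/s, one batch every {even_interval:.2f} s")
```

Sent back to back, the job pushes roughly 417 URLs per second and finishes in a little over three hours, while an evenly spread daily quota would correspond to roughly 58 URLs per second, or one 500-URL batch every 8.64 seconds. The single-threaded version therefore bursts well above the average rate the quota implies.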

PromQL Requests per minute

I'm trying to create a graph of total POST requests per minute, but there's a "ramp up" pattern that leads me to believe that I'm not getting the actual total of requests per minute, but a cumulative value.
Here is my query:
sum_over_time(django_http_responses_total_by_status_view_method_total{job="django-prod-app", method="POST", view="twitch_webhooks"}[1m])
Here are the "ramp up" patterns over 7 days (drop-offs indicating a reboot):
What leads me to believe my understanding of sum_over_time() is incorrect is that the existing webhooks should always exist. At the time of the most recent reboot we had 72k webhook subscriptions, so it doesn't make sense for the value to climb over time; it would make more sense to see a large spike at the start from catching up on webhooks that were not captured during downtime.
Is this query correct for what I'm trying to achieve?
I am using django-prometheus for exporting.
You want increase rather than sum_over_time, as this is a counter.
If the django_http_responses_total_by_status_view_method_total metric is a counter, then the increase() function must be used to return the number of requests during the last minute:
increase(django_http_responses_total_by_status_view_method_total[1m])
Note that the increase() function in Prometheus can return fractional results even if the django_http_responses_total_by_status_view_method_total metric contains only integer values. This is due to implementation details; see this comment and this article for details.
If the django_http_responses_total_by_status_view_method_total metric is a gauge that shows the number of requests since the previous sample, then the sum_over_time() function must be used to return the number of requests during the last minute:
sum_over_time(django_http_responses_total_by_status_view_method_total[1m])
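To see why sum_over_time() on a counter produces the "ramp up" pattern, the two functions can be imitated in a few lines of Python. This is a simplified sketch, not Prometheus's actual implementation: `simple_increase` ignores the extrapolation to the window edges that the real increase() performs, and the sample values are made up for illustration.

```python
from typing import Sequence


def sum_over_window(samples: Sequence[float]) -> float:
    """Add up the raw sample values in the window, as sum_over_time() does."""
    return sum(samples)


def simple_increase(samples: Sequence[float]) -> float:
    """Simplified increase(): total growth of a counter across the window.

    A drop in value is treated as a counter reset (the process restarted
    and the counter began again from zero). Prometheus additionally
    extrapolates to the window edges, which this sketch leaves out.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        total += (curr - prev) if curr >= prev else curr
    return total


# Hypothetical scrapes of a counter, 15 s apart, 10 requests between scrapes.
early_window = [100, 110, 120, 130]
late_window = [5000, 5010, 5020, 5030]

for window in (early_window, late_window):
    print(sum_over_window(window), simple_increase(window))
```

Both windows saw the same 30 new requests, so the simplified increase is identical, while the plain sum keeps growing with the absolute value of the counter and falls back only when the process restarts.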

The fast way to execute rest requests that require incremented value (nonce)

I'm working with a REST API that requires an incremented parameter to be sent with each request. I use Unix milliseconds as the nonce and originally, naively, sent requests one after another, but even if I send one message before another, they can arrive in reverse order, which results in an error.
One solution could be sending the next request only after the previous one has returned, but that would be too slow. I'm thinking about a less strict solution, like measuring latency over the last 10 requests and waiting for x% of that latency before sending the next message. I feel like this problem should already have been solved, but I can't find any good reference. I would appreciate any advice.

Getting QuotaExceededException - What are the operation quota limitations for Azure Notification Hubs?

I was doing some latency/performance testing for sending push notifications with Azure Notification Hub by consecutively sending many notifications in a foreach loop. It worked fine for 100 "SendNotification" requests, although it was relatively slow (14s), but I got a QuotaExceededException for 1000 requests in a row:
[QuotaExceededException: The remote server returned an error: (403)
Forbidden. The request was terminated because the namespace
pushnotification-testing is being throttled. Please wait 60 seconds
and try again. TrackingId:...
Even when I don't wait for 60 seconds as advised, I can again execute 100 consecutive requests, but 1000 requests in a row always fail... Anything slightly above 100 consecutive requests fails most of the time...
I couldn't find any documentation on these limitations. This should be documented somewhere, so I can be sure Azure Notification Hubs will fit my needs.
The answer to this question says
There is a throttling for CRUD operation's rate. Quotas depend on tire
your are but it is not going to be less then 2000 operations per
minute per namespace any way. If quota is exceed then service returns
403.
For me, it seems to be fewer than 2000 operations. By the way, I'm using the "FREE" tier for testing, but I guess we would switch to "STANDARD" for production.
Has anyone had similar experiences, or does anyone know where to look for more information?
In particular, what are the operation quota limits per time frame for the different tiers of Azure Notification Hubs?
UPDATE1: It's weird, but sending 1000 requests in parallel works most of the time, whereas sending them consecutively fails on the 101st request.
To the best of my knowledge, Notification Hubs currently has the following limits on the number of SENDS (not registrations) per namespace per minute per NH machine:
Free tier: 100
Basic tier: 900
Standard tier: 11500
Massive sending in parallel lets you send more because the calls are very likely to be routed to different machines.
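If you want to stay under a per-minute limit such as the ones listed above, one option is to pace the sends on the client side. The class below is a minimal sketch of a sliding-window limiter; `SlidingWindowLimiter` and its parameters are hypothetical names for this example, not part of any Notification Hubs SDK, and the per-tier figures are the ones reported in this answer rather than documented guarantees.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window of `period` seconds."""

    def __init__(
        self,
        max_calls: int,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # start times of recent calls

    def _evict(self, now: float) -> None:
        # Forget calls that have already left the window.
        while self._stamps and self._stamps[0] + self.period <= now:
            self._stamps.popleft()

    def acquire(self) -> None:
        """Block until another call fits in the window, then record it."""
        now = self._clock()
        self._evict(now)
        while len(self._stamps) >= self.max_calls:
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self._stamps[0] + self.period - now)
            now = self._clock()
            self._evict(now)
        self._stamps.append(now)
```

With `limiter = SlidingWindowLimiter(max_calls=100)` for the free tier, calling `limiter.acquire()` before each send keeps a single sequential client at or below 100 sends per minute. Since the limit is described as applying per machine on the service side, this is a conservative bound rather than an exact match for the service's own accounting.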

What does server throughput mean

If throughput increases, how will the response and request times change?
What if I have the data in requests/min?
JMeter's definition of throughput can be seen here: https://jmeter.apache.org/usermanual/glossary.html
Basically, it's a measure of how many requests JMeter was able to send to your test site/application in one second. In other words, it is the number of requests that your test site/application was able to receive from JMeter in one second. An increase in throughput means your site/application was able to receive more requests per second, while a decrease means a reduction in the number of requests it handled per second.
The relationship between throughput and response/request time totally depends, as ysth stated. I typically use this number to gauge the load on the server, but I run the test several times (at least 30) and take the average.
There's not necessarily a relationship. Can you tell us anything more about why you want to know this, what you plan to do with the information, etc.? It may help get you an answer better suited to your needs.
After completing development of a project, we as developers are responsible for testing the performance of the application.
As part of performance testing, we have to check:
1) Response time of the application
2) Bottlenecks of the application
3) Throughput of the application
Throughput of the application:
In general, it is the request capacity of the application in a given time.
As per the Apache JMeter docs:
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
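The formula can be applied directly to a list of sample timings. The snippet below is a small sketch with made-up data: each sample is a hypothetical `(start, end)` pair in seconds, and the total time runs from the start of the first sample to the end of the last one, as the quoted definition describes.

```python
from typing import Sequence, Tuple

# Hypothetical sample records: (start_time_s, end_time_s) for each request.
Sample = Tuple[float, float]


def throughput(samples: Sequence[Sample]) -> float:
    """Requests per second, from the first sample's start to the last sample's end."""
    if not samples:
        raise ValueError("need at least one sample")
    first_start = min(start for start, _ in samples)
    last_end = max(end for _, end in samples)
    total_time = last_end - first_start
    if total_time <= 0:
        raise ValueError("samples must span a positive amount of time")
    return len(samples) / total_time


samples = [(0.0, 0.4), (1.0, 1.3), (2.0, 2.5), (3.5, 4.0)]
print(throughput(samples))        # 4 requests over 4.0 s -> 1.0 request/s
print(throughput(samples) * 60)   # the same figure in requests/min -> 60.0
```

Because the idle gaps between samples are included in the total time, the result reflects the load the server actually saw rather than how fast each individual request completed.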