Cannot use sc_http_req_rate when the counter isn’t currently tracked - haproxy

As per documentation, sc_http_req_rate will return the rate of http requests from the stick table, but only from the currently tracked counters.
From testing, this means that if you are incrementing your counter using http-response rather than http-request, the field is unavailable. It cannot be used for rate limiting, or sent on to the backends.
A down to earth conseguence of this is that I cannot limit the number of bots generating too many 404 requests.
How can I load and use the stick table http rate during the request if it’s only tracked in the response?
Thank you :)

Related

Ajax polling vs SSE (performance on server side)

I'm curious about if there is some type of standard limit on when is better to use Ajax Polling instead of SSE, from a server side viewpoint.
1 request every second: I'm pretty sure is better SSE
1 request per minute: I'm pretty sure is better Ajax
But what about 1 request every 5 seconds? How can we calculate where is the limit frequency for Ajax or SSE?
No way is 1 request per minute always better for Ajax, so that assumption is flawed from the start. Any kind of frequent polling is nearly always a costly choice. It seems from our previous conversation in comments of another question that you start with a belief that an open TCP socket (whether SSE connection or webSocket connection) is somehow costly to server performance. An idle TCP connection takes zero CPU (maybe every once in a long while, a keep alive might be sent, but other than that, an idle socket does not use CPU). It does use a bit of server memory to handle the socket descriptor, but a highly tuned server can have 1,000,000 open sockets at once. So, your CPU usage is going to be more about how many connections are being established and what are they asking the server to do every time they are established than it is about how many open (and mostly idle) connections there are.
Remember, every http connection has to create a TCP socket (which is roundtrips between client/server), then send the http request, then get the http response, then close the socket. That's a lot of roundtrips of data to do every minute. If the connection is https, it's even more work and roundtrips to establish the connection because of the crypto layer and endpoint certification. So doing all that every minute for hundreds of thousands of clients seems like a massive waste of resources and bandwidth when you could create one SSE connection and the client just listen for data to stream from the server over that connection.
As I said in our earlier comment exchange on a different question, these types of questions are not really answerable in the abstract. You have to have specific requirements of both client and server and a specific understanding of the data being delivered and how urgent it is on the client and therefore a specific polling interval and a specific scale in order to begin to do some calculations or test harnesses to evaluate which might be the more desirable way to do things. There are simply too many variables to come up with a purely hypothetical answer. You have to define a scenario and then analyze different implementations for that specific scenario.
Number of requests per second is only one of many possible variables. For example, if most the time you poll there's actually nothing new, then that gives even more of an advantage to the SSE case because it would have nothing to do at all (zero load on the server other than a little bit of memory used for an open socket most of the time) whereas the polling creates continual load, even when nothing to do.
The #1 advantage to server push (whether implement with SSE or webSocket) is that the server only has to do anything with the client when there is actually pertinent data to send to that specific client. All the rest of the time, the socket is just sitting there idle (perhaps occasionally on a long interval, sending a keep-alive).
The #1 disadvantage to polling is that there may be lots of times that the client is polling the server and the server has to expend resources to deal with the polling request only to inform that client that it has nothing new.
How can we calculate where is the limit frequency for Ajax or SSE?
It's a pretty complicated process. Lots of variables in a specific scenario need to be defined. It's not as simple as just requests/sec. Then, you have to decide what you're attempting to measure or evaluate and at what scale? "Server performance" is the only thing you mention, but that has to be completely defined and different factors such as CPU usage and memory usage have to be weighted into whatever you're measuring or calculating. Then, you may even need to run some test harnesses if the calculations don't yield an obvious answer or if the decision is so critical that you want to verify your calculations with real metrics.
It sounds like you're looking for an answer like "at greater than x requests/min, you should use polling instead of SSE" and I don't think there is an answer that simple. It depends upon far more things than requests/min or requests/sec.
"Polling" incurs overhead on all parties. If you can avoid it, don't poll.
If SSE is an option, it might be a good choice. "It depends".
Q: What (if any) kind of "event(s)" will your app need to handle?

What does CoAP Max transaction in Contiki means?

I didn't get whether Max transactions refer to client side or server side of CoAP. For instance, if COAP_MAX_OPEN_TRANSACTIONS is 4. Does it mean that CoAP Client can send 4 parallel request to different servers or it means that CoAP Server can process max 4 requests in parallel.
Because from the code I see that it initiates a blocking request from the client side which will not allow looping for another transaction.
So, need clarification here. If multiple CoAP transactions possible from client side then please mention how. Thank you.
According to paper dunkels.com/adam/kovatsch11low-power.pdf
Section III-F CoAP Clients provide a blocking function call implemented with protothreads to issue a request. This linear programming model can also hide blockwise transfers, as it continues first when all data were received. So based on this I am guessing client can generate one transaction at a time and blocks to wait for ack (or timeout).
Here is code reference https://github.com/contiki-os/contiki/blob/master/apps/er-coap/er-coap-engine.c#L370.
Contrarily, Server can respond to multiple transactions simultaneously because there are transactions which wait for response (from say sensors) and need to save state. This is my understanding of the question posted. If I am wrong then please correct.
According to links:
https://github.com/contiki-os/contiki/blob/bc2e445817aa546c0bb93a9900093ec276005e2a/apps/er-coap/er-coap-conf.h#L51
https://github.com/contiki-ng/contiki-ng/wiki/Documentation:-CoAP#configuration
I guess it's just a max number of confirmable requests (which have not yet received an ACK) to be stored simultaneously for retransmission.
And it used for reserving memory for the max number of those requests:
https://github.com/contiki-os/contiki/blob/3f4436bac9a9f6da0df188372d4374693eab8a52/apps/er-coap/er-coap-transactions.c#L57
MEMB(transactions_memb, coap_transaction_t, COAP_MAX_OPEN_TRANSACTIONS);

lexicographicMode in asyncio bulkCmd

While using Asynchronous pysnmp bulkCmd with asyncio, if requested OID have many values (like 1.3.6.1.2.1.17.4.3.1.2 that show MAC address learned by Cisco switch) or if using several OIDs in one request, I have problem that total number of OIDs in response limited by MTU/MSS of network, which means that not all OIDs received.
This problem can control if using lexicographicMode in Synchronous bulkCmd, but Asynchronous bulkCmd generator haven`t that options.
It is possible to use getNext but it is significantly reducing performance because of increasing total number of packet (one request/response per OID).
Is there way to control that all "sub oid" received in response using Asynchronous bulkCmd ?
Could you use the maxRepetitions parameter to limit the number of response OIDs per each requested OID? It's 50 in this example.
I believe the lexicographicMode option is designed to stop walking the MIB once the initial prefix goes out of scope. So it has only indirect influence over message size what makes it unreliable for the purpose.

rate limit policy on queries to Azure Insights REST API for Events (Audit Logs)

I have some questions regarding Azure Insights REST Api for Events.
When I make HTTP request to Inisghts API for events, I receive the header "
x-ms-ratelimit-remaining-subscription-reads", with value "14999".
But next query in 1s returns me the same value of remaining reads.
I see there is some throttling policy there, but I would like to understand how it works and what is the correct way to deal with that.
In particular,
1) how many reads I am able to do per second?
2) if I exceed the whole remaining reads parameter, how much time should I wait before it will again be maximum?
3) is it decreased on every query attempt, despite of the $top parameter setted and how many results has been returned?
Thank you!
This article seems to have the responses you need.
To answer the questions based on it:
There is no limit to the number of requests per second, but you have 15k
requests/hour/subscription/region/instance of ARM region. Worst case scenario you will get throttled after 15k requests but you'd have to be extremely unlucky for that.
If you exceed the limit, you are
told how much you have to wait and you can integrate that logic by
looking at the Retry-After header. Happily, it's a matter of
seconds.
I believe the $top parameter doesn't affect the query since
no matter how many results are brought back, a paging request is
still just one request.
As for the fact that you get 14999 requests
remaining multiple times, as they say in their documentation it is
expected since an ARM region has multiple instances and each instance has
15k requests limit/subscription/hour. If you hit simultaneously and
you get the same number remaining, it just means that you were lucky
enough to hit different instances within the same ARM region.
1) how many reads I am able to do per second?
Based on the rate limits published here - https://azure.microsoft.com/en-in/documentation/articles/azure-subscription-service-limits/#subscription-limits, you can perform 15000 reads / hour (not sure it would translate to 4 reads / second).
2) if I exceed the whole remaining reads parameter, how much time
should I wait before it will again be maximum?
Given the rates are defined per hour, my guess would be to wait till next hour if you exhaust 15000 read request limit.
3) is it decreased on every query attempt, despite of the $top
parameter setted and how many results has been returned?
This is based on the number of API calls and not the amount of data returned. So I would say defining $top parameter should not have any impact on this.
When I make HTTP request to Inisghts API for events, I receive the
header " x-ms-ratelimit-remaining-subscription-reads", with value
"14999". But next query in 1s returns me the same value of remaining
reads.
I would assume there's some caching in play here. Is it the same request you're repeating or a different request all together?

Getting QuotaExceededException - What are the operation quota limitations for Azure Notification Hubs?

I was doing some latency/performance testing for sending push notifications with Azure Notification Hub by consecutively sending many notifications in a foreach loop. It worked fine for 100 "SendNotification" requests, altough it was relatively slow (14s), but I got a QuotaExceededException for 1000 requests in a row:
[QuotaExceededException: The remote server returned an error: (403)
Forbidden. The request was terminated because the namespace
pushnotification-testing is being throttled. Please wait 60 seconds
and try again. TrackingId:...
Even when I don't wait for 60 seconds as advised, I can again execute 100 consecutive requests, but 1000 requests in a row always fail... Anything slightly above 100 consecutive requests fails most of the time...
I couldn't find any documentation on these limitations. This should be documented somewhere, so I can be sure Azure Notification Hubs will fit my needs.
The answer to this question says
There is a throttling for CRUD operation's rate. Quotas depend on tire
your are but it is not going to be less then 2000 operations per
minute per namespace any way. If quota is exceed then service returns
403.
For me, it seems to be less then 2000 operations. By the way, I'm using "FREE" tier for testing, but I guess we would switch to "STANDARD" for production.
Has anyone similar experiences or knows where to look for more information?
In particular, what are the operation quota limitations per timefram for the different tiers of Azure Notification Hubs?
UPDATE1: It's weird, but I sending 1000 requests in parallel works most of the time, but consecutively it fails on the 101st request.
For my best knowledge for right now NH has following limitations on number of SENDS (not registrations) per namespace per minute per NH machine:
Free tire: 100
Basic tire: 900
Standard tire: 11500
Massive sending in parallel allows to send more because calls are very likely to be routed on different machines.